[jira] [Work logged] (BEAM-7962) Drop support for Flink 1.5 and 1.6
[ https://issues.apache.org/jira/browse/BEAM-7962?focusedWorklogId=316031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316031 ]

ASF GitHub Bot logged work on BEAM-7962:

Author: ASF GitHub Bot
Created on: 21/Sep/19 02:00
Start Date: 21/Sep/19 02:00
Worklog Time Spent: 10m
Work Description: mxm commented on pull request #9632: [BEAM-7962] Drop support for Flink 1.5 and 1.6
URL: https://github.com/apache/beam/pull/9632

Flink 1.9 is now released and Beam 2.17.0 is going to support it. Since the Flink community only supports the last two Flink releases, it is now time to drop at least 1.5 and 1.6.

As a follow-up, we will get rid of Flink 1.5/1.6 specific workarounds, e.g. make use of Flink's preSnapshotBarrier in AbstractStreamOperator, which removes the need to buffer elements during a snapshot.

Build time should decrease by several minutes.
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=316030&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316030 ]

ASF GitHub Bot logged work on BEAM-5428:

Author: ASF GitHub Bot
Created on: 21/Sep/19 01:29
Start Date: 21/Sep/19 01:29
Worklog Time Spent: 10m
Work Description: mxm commented on issue #9374: [BEAM-5428] Implement Runner support for cache tokens
URL: https://github.com/apache/beam/pull/9374#issuecomment-527461152

Run Java PreCommit

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---

Worklog Id: (was: 316030)
Time Spent: 18h 20m (was: 18h 10m)

> Implement cross-bundle state caching.
> ---
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Rakesh Kumar
> Priority: Major
> Time Spent: 18h 20m
> Remaining Estimate: 0h
>
> Tech spec:
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document:
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link:
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8227) Move private IP dependency validation test into a Beam project test
[ https://issues.apache.org/jira/browse/BEAM-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934856#comment-16934856 ]

Ahmet Altay commented on BEAM-8227:
---

This would ensure that the container has all the matching dependencies required by the SDK, since the private IP flag ensures that containers cannot download additional dependencies at runtime. It can be considered part of the custom containers project.

> Move private IP dependency validation test into a Beam project test
> ---
>
> Key: BEAM-8227
> URL: https://issues.apache.org/jira/browse/BEAM-8227
> Project: Beam
> Issue Type: Improvement
> Components: testing
> Reporter: Hannah Jiang
> Assignee: Hannah Jiang
> Priority: Major
>
> Move private IP dependency validation test into a Beam project test (rather
> than internal Dataflow)
[jira] [Work logged] (BEAM-8275) Beam SQL should support BigQuery in DIRECT_READ mode
[ https://issues.apache.org/jira/browse/BEAM-8275?focusedWorklogId=316014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316014 ]

ASF GitHub Bot logged work on BEAM-8275:

Author: ASF GitHub Bot
Created on: 21/Sep/19 00:54
Start Date: 21/Sep/19 00:54
Worklog Time Spent: 10m
Work Description: amaliujia commented on issue #9625: [BEAM-8275] Beam SQL should support BigQuery in DIRECT_READ mode
URL: https://github.com/apache/beam/pull/9625#issuecomment-533754059

Run SQL Postcommit

Issue Time Tracking
---

Worklog Id: (was: 316014)
Time Spent: 50m (was: 40m)

> Beam SQL should support BigQuery in DIRECT_READ mode
> ---
>
> Key: BEAM-8275
> URL: https://issues.apache.org/jira/browse/BEAM-8275
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Andrew Pilloud
> Assignee: Kirill Kozlov
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> SQL currently only supports reading from BigQuery in DEFAULT (EXPORT) mode.
> We also need to support DIRECT_READ mode. The method should be configurable
> by TBLPROPERTIES through the SQL CLI. This will enable us to take advantage
> of the DIRECT_READ features for filter and project push down.
> References:
> [https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api]
> [https://beam.apache.org/blog/2019/06/04/adding-data-sources-to-sql.html]
> [https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryTable.java]
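The issue above asks for the BigQuery read method to be selectable through TBLPROPERTIES. As a rough sketch of that idea only, the Python below parses a JSON properties string and picks a method. The property key `method`, the function name `resolve_read_method`, and the JSON encoding are all assumptions made for illustration; the issue text does not specify them, and this is not Beam's implementation.

```python
import json

# DEFAULT, EXPORT and DIRECT_READ are the mode names used in the issue text.
SUPPORTED_METHODS = ("DEFAULT", "EXPORT", "DIRECT_READ")


def resolve_read_method(tblproperties):
    # type: (str) -> str
    """Returns the read method named by a JSON properties string.

    The "method" key is a hypothetical name chosen for this sketch.
    An empty string or a missing key falls back to DEFAULT, which the
    issue describes as the export-based read.
    """
    props = json.loads(tblproperties) if tblproperties else {}
    method = str(props.get("method", "DEFAULT")).upper()
    if method not in SUPPORTED_METHODS:
        raise ValueError("Unsupported BigQuery read method: %s" % method)
    return "EXPORT" if method == "DEFAULT" else method
```

Falling back to the export-based read when nothing is configured keeps existing table definitions working unchanged, which seems to be the intent of treating DEFAULT and EXPORT as the same mode.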
[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API
[ https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=316012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316012 ]

ASF GitHub Bot logged work on BEAM-6855:

Author: ASF GitHub Bot
Created on: 21/Sep/19 00:50
Start Date: 21/Sep/19 00:50
Worklog Time Spent: 10m
Work Description: reuvenlax commented on pull request #9612: [BEAM-6855] Side inputs are not supported when using the state API
URL: https://github.com/apache/beam/pull/9612#discussion_r326839653

## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java ##
@@ -2294,6 +2294,47 @@ public void processElement(
         .containsInAnyOrder(Lists.newArrayList(12, 42, 84, 97), Lists.newArrayList(0, 1, 2));
     pipeline.run();
   }
+
+  @Test
+  @Category({ValidatesRunner.class, UsesStatefulParDo.class, UsesSideInputs.class})
+  public void testStateSideInput() {
+    final PCollectionView<Integer> sideInput =
+        pipeline
+            .apply("CreateSideInput1", Create.of(2))
+            .apply("ViewSideInput1", View.asSingleton());
+
+    TestSimpleStatefulDoFn fn = new TestSimpleStatefulDoFn(sideInput);
+    pipeline.apply(Create.of(KV.of(1, 2))).apply(ParDo.of(fn).withSideInputs(sideInput));
+
+    pipeline.run();
+  }
+
+  private static class TestSimpleStatefulDoFn extends DoFn<KV<Integer, Integer>, Integer> {
+
+    private final PCollectionView<Integer> view;
+
+    @StateId("foo")
+    private final StateSpec<ValueState<Integer>> spec = StateSpecs.value(VarIntCoder.of());
+
+    private TestSimpleStatefulDoFn(PCollectionView<Integer> view) {
+      this.view = view;
+    }
+
+    @ProcessElement
+    public void processElem(ProcessContext c) {
+      // noop

Review comment:
can you add code here to access the state and the side input?

Issue Time Tracking
---

Worklog Id: (was: 316012)
Time Spent: 5h 20m (was: 5h 10m)

> Side inputs are not supported when using the state API
> ---
>
> Key: BEAM-6855
> URL: https://issues.apache.org/jira/browse/BEAM-6855
> Project: Beam
> Issue Type: Bug
> Components: runner-core, runner-dataflow, runner-direct
> Reporter: Reuven Lax
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Time Spent: 5h 20m
> Remaining Estimate: 0h
[jira] [Work logged] (BEAM-7746) Add type hints to python code
[ https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=316010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316010 ]

ASF GitHub Bot logged work on BEAM-7746:

Author: ASF GitHub Bot
Created on: 21/Sep/19 00:47
Start Date: 21/Sep/19 00:47
Worklog Time Spent: 10m
Work Description: chadrik commented on issue #9056: [BEAM-7746] Add python type hints
URL: https://github.com/apache/beam/pull/9056#issuecomment-533752642

This PR is getting close. It represents an enormous amount of effort, and spans a substantial part of the code base, so I'd like to start progressing towards getting this merged. Rebasing against master has been pretty painless so far, but I'm afraid that could change soon.

I've broken the PR up into bite-sized commits to help tell a story about each set of changes. I've also dropped the more complex changes -- the mypy plugin and generics -- and I'll deal with those in a future PR.

Other than questions and comments you might have about the content of the PR, the main issues that I need input on are primarily about style.

## Line length

It's very difficult to keep type comments to 80 characters. Not only is there more information to describe, the type comments themselves cannot span more than one line. This will improve when we get to python 3.6 and annotations are an official part of the syntax, since they can be defined over multiple lines just like normal python objects (which they are).

There are a handful of practices that will minimize our line length, but even in concert they won't work 100% of the time. Here are the main ones:

- Use `from typing import Foo` vs `import typing`. This is the single most impactful thing we can do. It also greatly improves legibility. Compare these two options:
  - `typing.Optional[typing.Tuple[typing.Dict[str, str], float]]`
  - `Optional[Tuple[Dict[str, str], float]]`
- Use type aliases. e.g. `AwesomeType = Optional[Tuple[Dict[str, str], float]]`. I prefer to use this sparingly, only when there's a complex type shared in many places. Here's why:
  - quite often we can use duck-typing to reduce the requirement for certain functions. e.g. there might be a function where `Optional[Tuple[Mapping[str, str], typing.SupportsFloat]]` would do instead of `AwesomeType`.
  - I don't like to have to constantly refer to another location to see the type
- Change the way that we style functions so that they provide more room for annotations. e.g. consider these options:

```python
def really_long_function_name(arg1,  # type: Optional[Tuple[Dict[str, str], float]]
                              arg2,  # type: int
                             ):
  # type: (...) -> Tuple[str, float]
  code
```

```python
def really_long_function_name(
    arg1,  # type: Optional[Tuple[Dict[str, str], float]]
    arg2,  # type: int
):
  # type: (...) -> Tuple[str, float]
  code
```

The beam code seems to favor the former over the latter, though I see both present. We should decide what our policy will be. In this PR, I've determined the style on a case-by-case basis, mostly favoring the former.

When all else fails we can use `# pylint: disable=line-too-long` *after* the type comment. I've added these to `apache_beam.pipeline` so you can see an example.

It's a larger conversation, but it might be worth discussing increasing the line length. Many type-annotated projects have increased their line length to 99 or more characters. This is a big change that would involve a lot of debate.

## Unused module imports

Pylint is not able to properly track type annotations used within type comments (which is the majority), and so generates spurious errors about unused imports for most of the typing classes. Newer versions of pylint can track annotations within comments, but only for variable annotations and not for function annotations, so it's not a complete solution. Even if we think there is a benefit to a partial solution, it will take some work to get to the newer version of pylint because it's python3-only.

The solution I'm proposing for now is to simply ignore the problem, by using the following pattern when importing types:

```python
# pylint: disable=unused-import
from typing import TYPE_CHECKING
from typing import Any
from typing import Callable
from typing import Container
from typing import DefaultDict
from typing import Dict
from typing import Iterable
from typing import Iterator
from typing import List
# pylint: enable=unused-import
```

That will leave it up to developers to get right for now, and when we get to a pure python3 code-base
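To make the style points in the comment above concrete, here is a small self-contained sketch that combines direct `from typing import ...` imports wrapped in the pylint pragma, a type alias, and the second function layout with per-argument type comments. The names `LabeledScore` and `summarize` are hypothetical and exist only for this illustration; they are not taken from the Beam code base.

```python
# pylint: disable=unused-import
from typing import Dict
from typing import Optional
from typing import Tuple
# pylint: enable=unused-import

# Hypothetical alias, playing the role of `AwesomeType` in the comment above.
LabeledScore = Optional[Tuple[Dict[str, str], float]]


def summarize(
    labeled,  # type: LabeledScore
    scale,  # type: int
):
  # type: (...) -> Tuple[str, float]
  """Returns a sorted 'key=value' label string and the scaled score."""
  if labeled is None:
    return ('', 0.0)
  labels, score = labeled
  text = ','.join('%s=%s' % (k, v) for k, v in sorted(labels.items()))
  return (text, score * scale)
```

Type comments are ignored at runtime, so the function behaves the same with or without them; only a checker such as mypy reads them. Putting each argument on its own line leaves the full line width for its type comment, which is the room the third bullet is asking for.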
[jira] [Work logged] (BEAM-7746) Add type hints to python code
[ https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=316009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316009 ]

ASF GitHub Bot logged work on BEAM-7746:

Author: ASF GitHub Bot
Created on: 21/Sep/19 00:39
Start Date: 21/Sep/19 00:39
Worklog Time Spent: 10m
Work Description: chadrik commented on issue #9056: [BEAM-7746] Add python type hints
URL: https://github.com/apache/beam/pull/9056#issuecomment-533752642
[jira] [Work logged] (BEAM-8146) SchemaCoder/RowCoder have no equals() function
[ https://issues.apache.org/jira/browse/BEAM-8146?focusedWorklogId=316008&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316008 ]

ASF GitHub Bot logged work on BEAM-8146:

Author: ASF GitHub Bot
Created on: 21/Sep/19 00:38
Start Date: 21/Sep/19 00:38
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on issue #9493: [BEAM-8146] Add equals and hashCode to SchemaCoder and RowCoder
URL: https://github.com/apache/beam/pull/9493#issuecomment-533752600

Run Apex ValidatesRunner

Issue Time Tracking
---

Worklog Id: (was: 316008)
Time Spent: 0.5h (was: 20m)

> SchemaCoder/RowCoder have no equals() function
> ---
>
> Key: BEAM-8146
> URL: https://issues.apache.org/jira/browse/BEAM-8146
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.15.0
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> SchemaCoder has no equals function, so it can't be compared in tests, like
> CloudComponentsTests$DefaultCoders, which is being re-enabled in
> https://github.com/apache/beam/pull/9446
[jira] [Work logged] (BEAM-8146) SchemaCoder/RowCoder have no equals() function
[ https://issues.apache.org/jira/browse/BEAM-8146?focusedWorklogId=316007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-316007 ]

ASF GitHub Bot logged work on BEAM-8146:

Author: ASF GitHub Bot
Created on: 21/Sep/19 00:38
Start Date: 21/Sep/19 00:38
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on issue #9493: [BEAM-8146] Add equals and hashCode to SchemaCoder and RowCoder
URL: https://github.com/apache/beam/pull/9493#issuecomment-533752586

Run Flink ValidatesRunner

Issue Time Tracking
---

Worklog Id: (was: 316007)
Time Spent: 20m (was: 10m)

> SchemaCoder/RowCoder have no equals() function
> ---
>
> Key: BEAM-8146
> URL: https://issues.apache.org/jira/browse/BEAM-8146
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.15.0
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> SchemaCoder has no equals function, so it can't be compared in tests, like
> CloudComponentsTests$DefaultCoders, which is being re-enabled in
> https://github.com/apache/beam/pull/9446
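The problem described in BEAM-8146 is about Java's `equals()` and `hashCode()`, but the same effect can be shown with a Python analogue: without a value-based equality method, two coders built from the same schema compare unequal, so a test cannot assert on them. Both classes below are hypothetical stand-ins written for this sketch and are not Beam's `SchemaCoder` or `RowCoder`.

```python
class IdentityOnlyCoder(object):
    """Hypothetical coder without value equality (compares by identity)."""

    def __init__(self, field_names):
        self.field_names = tuple(field_names)


class ValueEqualCoder(object):
    """Hypothetical coder that defines equality and hashing on its schema."""

    def __init__(self, field_names):
        self.field_names = tuple(field_names)

    def __eq__(self, other):
        if not isinstance(other, ValueEqualCoder):
            return NotImplemented
        return self.field_names == other.field_names

    def __hash__(self):
        # Must agree with __eq__: equal coders produce equal hashes.
        return hash(self.field_names)
```

With the second class, `ValueEqualCoder(['id', 'name']) == ValueEqualCoder(['id', 'name'])` holds and both instances collapse to one entry in a set, whereas two `IdentityOnlyCoder` instances built the same way never compare equal. Defining the hash alongside equality matters for the same reason it does in Java: hash-based collections rely on the two agreeing.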
[jira] [Created] (BEAM-8296) Containerize the Spark job server
Kyle Weaver created BEAM-8296:
---

Summary: Containerize the Spark job server
Key: BEAM-8296
URL: https://issues.apache.org/jira/browse/BEAM-8296
Project: Beam
Issue Type: Improvement
Components: runner-spark
Reporter: Kyle Weaver
Assignee: Kyle Weaver

a la [https://github.com/apache/beam/blob/master/runners/flink/job-server-container/flink_job_server_container.gradle]
[jira] [Work logged] (BEAM-8295) Python streaming doc changes
[ https://issues.apache.org/jira/browse/BEAM-8295?focusedWorklogId=315995&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315995 ]

ASF GitHub Bot logged work on BEAM-8295:

Author: ASF GitHub Bot
Created on: 20/Sep/19 23:29
Start Date: 20/Sep/19 23:29
Worklog Time Spent: 10m
Work Description: rosetn commented on issue #9630: [BEAM-8295] Python streaming doc changes
URL: https://github.com/apache/beam/pull/9630#issuecomment-533743643

Pulled in too many changes, closing this PR

Issue Time Tracking
---

Worklog Id: (was: 315995)
Time Spent: 20m (was: 10m)

> Python streaming doc changes
> ---
>
> Key: BEAM-8295
> URL: https://issues.apache.org/jira/browse/BEAM-8295
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Rose Nguyen
> Assignee: Rose Nguyen
> Priority: Minor
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Python streaming doc changes for the 2.16 release
[jira] [Work logged] (BEAM-8295) Python streaming doc changes
[ https://issues.apache.org/jira/browse/BEAM-8295?focusedWorklogId=315996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315996 ]

ASF GitHub Bot logged work on BEAM-8295:

Author: ASF GitHub Bot
Created on: 20/Sep/19 23:29
Start Date: 20/Sep/19 23:29
Worklog Time Spent: 10m
Work Description: rosetn commented on pull request #9630: [BEAM-8295] Python streaming doc changes
URL: https://github.com/apache/beam/pull/9630

Issue Time Tracking
---

Worklog Id: (was: 315996)
Time Spent: 0.5h (was: 20m)

> Python streaming doc changes
> ---
>
> Key: BEAM-8295
> URL: https://issues.apache.org/jira/browse/BEAM-8295
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Rose Nguyen
> Assignee: Rose Nguyen
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Python streaming doc changes for the 2.16 release
[jira] [Work logged] (BEAM-8295) Python streaming doc changes
[ https://issues.apache.org/jira/browse/BEAM-8295?focusedWorklogId=315994&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315994 ]

ASF GitHub Bot logged work on BEAM-8295:

Author: ASF GitHub Bot
Created on: 20/Sep/19 23:28
Start Date: 20/Sep/19 23:28
Worklog Time Spent: 10m
Work Description: rosetn commented on pull request #9630: [BEAM-8295] Python streaming doc changes
URL: https://github.com/apache/beam/pull/9630

Python streaming doc changes for 2.16 release
Removes limitations from "experimental" status
[jira] [Created] (BEAM-8295) Python streaming doc changes
Rose Nguyen created BEAM-8295:
---

Summary: Python streaming doc changes
Key: BEAM-8295
URL: https://issues.apache.org/jira/browse/BEAM-8295
Project: Beam
Issue Type: Improvement
Components: website
Reporter: Rose Nguyen
Assignee: Rose Nguyen

Python streaming doc changes for the 2.16 release
[jira] [Updated] (BEAM-8289) Failing the job when the temporary file is accidentally deleted by another job in FileBasedSink
[ https://issues.apache.org/jira/browse/BEAM-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heejong Lee updated BEAM-8289: -- Description: There was an issue that a temporary directory name is not unique in FileBasedSink (https://issues.apache.org/jira/browse/BEAM-7689) and we've changed the naming algorithm from timestamp to UUID. However, it looks like UUID is still not enough to distinguish two separate temporary directories from different jobs. We may need to fail the job when the temporary file is accidentally deleted by another job. (was: There was an issue that a temporary directory name is not unique in FileBasedSink (https://issues.apache.org/jira/browse/BEAM-7689) and we've changed the naming algorithm from timestamp to UUID. However, it looks like UUID is still not enough to distinguish two separate temporary directories from different jobs. We may need to put job name or job id to the temporary directory name for avoiding future naming collision.) > Failing the job when the temporary file is accidentally deleted by another > job in FileBasedSink > --- > > Key: BEAM-8289 > URL: https://issues.apache.org/jira/browse/BEAM-8289 > Project: Beam > Issue Type: Bug > Components: io-java-files >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > > There was an issue that a temporary directory name is not unique in > FileBasedSink (https://issues.apache.org/jira/browse/BEAM-7689) and we've > changed the naming algorithm from timestamp to UUID. However, it looks like > UUID is still not enough to distinguish two separate temporary directories > from different jobs. We may need to fail the job when the temporary file is > accidentally deleted by another job. -- This message was sent by Atlassian Jira (v8.3.4#803005)
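The "fail the job when the temporary file is deleted" idea in the updated description can be illustrated with a minimal sketch. This is not FileBasedSink code: the function name `finalize_outputs`, the exception `TemporaryFileMissingError`, and the directory layout are hypothetical and only model the check-before-finalize step on a local filesystem.

```python
import os


class TemporaryFileMissingError(RuntimeError):
    """Raised when a temporary output file vanished before finalization."""


def finalize_outputs(temp_paths, final_dir):
    """Move temp files into final_dir, failing loudly if any is missing.

    Hypothetical helper: a silently missing temp file would otherwise
    produce incomplete output, so the whole step is aborted instead.
    """
    missing = [p for p in temp_paths if not os.path.exists(p)]
    if missing:
        raise TemporaryFileMissingError(
            "temporary files missing before finalize: " + ", ".join(missing))
    os.makedirs(final_dir, exist_ok=True)
    moved = []
    for path in temp_paths:
        target = os.path.join(final_dir, os.path.basename(path))
        os.replace(path, target)  # rename within the same filesystem
        moved.append(target)
    return moved
```

The check runs before any file is moved, so a missing temp file leaves the final directory untouched rather than half-written.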
[jira] [Updated] (BEAM-8289) Failing the job when the temporary file is accidentally deleted by another job in FileBasedSink
[ https://issues.apache.org/jira/browse/BEAM-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heejong Lee updated BEAM-8289: -- Summary: Failing the job when the temporary file is accidentally deleted by another job in FileBasedSink (was: Append job name (or job id) to temporary directory name in FileBasedSink) > Failing the job when the temporary file is accidentally deleted by another > job in FileBasedSink > --- > > Key: BEAM-8289 > URL: https://issues.apache.org/jira/browse/BEAM-8289 > Project: Beam > Issue Type: Bug > Components: io-java-files >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > > There was an issue that a temporary directory name is not unique in > FileBasedSink (https://issues.apache.org/jira/browse/BEAM-7689) and we've > changed the naming algorithm from timestamp to UUID. However, it looks like > UUID is still not enough to distinguish two separate temporary directories > from different jobs. We may need to put job name or job id to the temporary > directory name for avoiding future naming collision. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-8294) Spark portable validates runner tests timing out
[ https://issues.apache.org/jira/browse/BEAM-8294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver updated BEAM-8294: -- Description: This postcommit has been timing out for 11 days. [https://github.com/apache/beam/pull/9095] has been merged for about 11 days. Coincidence? I think NOT! .. .Seriously, though, I wonder what about the SDK worker management stack caused this to slow down. [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/] was: This postcommit has been timing out for 11 days. [https://github.com/apache/beam/pull/9095] has been merged for about 11 days. Coincidence? I think NOT! .. .Seriously, though, I wonder what about the SDK worker management stack caused this to slow down. [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/] [!image-2019-09-20-16-13-58-946.png!|https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/] > Spark portable validates runner tests timing out > > > Key: BEAM-8294 > URL: https://issues.apache.org/jira/browse/BEAM-8294 > Project: Beam > Issue Type: Improvement > Components: runner-spark, testing >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-spark > > This postcommit has been timing out for 11 days. > [https://github.com/apache/beam/pull/9095] has been merged for about 11 days. > Coincidence? I think NOT! .. .Seriously, though, I wonder what about the SDK > worker management stack caused this to slow down. > [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8294) Spark portable validates runner tests timing out
Kyle Weaver created BEAM-8294: - Summary: Spark portable validates runner tests timing out Key: BEAM-8294 URL: https://issues.apache.org/jira/browse/BEAM-8294 Project: Beam Issue Type: Improvement Components: runner-spark, testing Reporter: Kyle Weaver Assignee: Kyle Weaver This postcommit has been timing out for 11 days. [https://github.com/apache/beam/pull/9095] has been merged for about 11 days. Coincidence? I think NOT! .. .Seriously, though, I wonder what about the SDK worker management stack caused this to slow down. [https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/] [!image-2019-09-20-16-13-58-946.png!|https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8240) Fix pipeline proto to contain worker_harness_container_image override
[ https://issues.apache.org/jira/browse/BEAM-8240?focusedWorklogId=315989&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315989 ] ASF GitHub Bot logged work on BEAM-8240: Author: ASF GitHub Bot Created on: 20/Sep/19 23:12 Start Date: 20/Sep/19 23:12 Worklog Time Spent: 10m Work Description: chamikaramj commented on pull request #9629: [BEAM-8240] Sets workerHarnessContainerImage in the default Environment of DataflowRunner URL: https://github.com/apache/beam/pull/9629 Sets workerHarnessContainerImage as the containerImage of the DockerPayload of the default environment for DataflowRunner. This is similar to the change https://github.com/apache/beam/pull/9583 made to the Python SDK. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
[jira] [Work logged] (BEAM-8240) Fix pipeline proto to contain worker_harness_container_image override
[ https://issues.apache.org/jira/browse/BEAM-8240?focusedWorklogId=315990&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315990 ] ASF GitHub Bot logged work on BEAM-8240: Author: ASF GitHub Bot Created on: 20/Sep/19 23:12 Start Date: 20/Sep/19 23:12 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #9629: [BEAM-8240] Sets workerHarnessContainerImage in the default Environment of DataflowRunner URL: https://github.com/apache/beam/pull/9629#issuecomment-533740382 R: @lukecwik This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315990) Time Spent: 3h 50m (was: 3h 40m) > Fix pipeline proto to contain worker_harness_container_image override > - > > Key: BEAM-8240 > URL: https://issues.apache.org/jira/browse/BEAM-8240 > Project: Beam > Issue Type: Bug > Components: sdk-py-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > SDK harness incorrectly identifies itself when using custom SDK container > within environment field when building pipeline proto. > > Passing in the experiment *worker_harness_container_image=YYY* doesn't > override the pipeline proto environment field and it is still being populated > with *gcr.io/cloud-dataflow/v1beta3/python-fnapi:beam-master-20190802* > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7688) Flink portable runner gets stuck when waiting for SDK Harness to close
[ https://issues.apache.org/jira/browse/BEAM-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver updated BEAM-7688: -- Issue Type: Bug (was: Improvement) > Flink portable runner gets stuck when waiting for SDK Harness to close > -- > > Key: BEAM-7688 > URL: https://issues.apache.org/jira/browse/BEAM-7688 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > > When parallelism = nproc: > "MapPartition (MapPartition at [37]{Analyze, RandomizeData, ReadFromText, > DecodeForAnalyze}) (9/12)" #2855 prio=5 os_prio=0 tid=0x7f9184022800 > nid=0x2b58 waiting on condition [0x7f9091592000] > java.lang.Thread.State: WAITING (parking) > at (C/C++) 0x7f926a97a9f2 (Unknown Source) > at (C/C++) 0x7f9269f1dd99 (Unknown Source) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xca218030> (a > java.util.concurrent.CompletableFuture$Signaller) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693) > at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) > at > java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) > at > org.apache.beam.sdk.fn.data.CompletableFutureInboundDataClient.awaitCompletion(CompletableFutureInboundDataClient.java:48) > at > org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.awaitCompletion(BeamFnDataInboundObserver.java:90) > at > org.apache.beam.runners.fnexecution.control.SdkHarnessClient$ActiveBundle.close(SdkHarnessClient.java:298) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.$closeResource(FlinkExecutableStageFunction.java:209) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:209) > 
at > org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:103) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503) > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712) > at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7760) Interactive Beam Caching PCollections bound to user defined vars in notebook
[ https://issues.apache.org/jira/browse/BEAM-7760?focusedWorklogId=315981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315981 ] ASF GitHub Bot logged work on BEAM-7760: Author: ASF GitHub Bot Created on: 20/Sep/19 22:50 Start Date: 20/Sep/19 22:50 Worklog Time Spent: 10m Work Description: KevinGG commented on issue #9619: [BEAM-7760] Added pipeline_instrument module URL: https://github.com/apache/beam/pull/9619#issuecomment-533735184 R:@aaltay PTAL I'll fix the PyLint checks in the PreCommit. There are many warning level (exit code 4) lint reports related to Python2 and Beam pipeline definition in unit test code that fail some Gradle tasks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315981) Time Spent: 9h 20m (was: 9h 10m) > Interactive Beam Caching PCollections bound to user defined vars in notebook > > > Key: BEAM-7760 > URL: https://issues.apache.org/jira/browse/BEAM-7760 > Project: Beam > Issue Type: New Feature > Components: examples-python >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 9h 20m > Remaining Estimate: 0h > > Cache only PCollections bound to user defined variables in a pipeline when > running pipeline with interactive runner in jupyter notebooks. > [Interactive > Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]] > has been caching and using caches of "leaf" PCollections for interactive > execution in jupyter notebooks. > The interactive execution is currently supported so that when appending new > transforms to existing pipeline for a new run, executed part of the pipeline > doesn't need to be re-executed. > A PCollection is "leaf" when it is never used as input in any PTransform in > the pipeline. 
> The problem with building caches and the pipeline to execute around "leaf" > PCollections is that when a PCollection is consumed by a sink with no output, > the pipeline built for execution will miss the subgraph generating and > consuming that PCollection. > For example, "ReadFromPubSub --> WriteToPubSub" will result in an empty > pipeline. > Caching around PCollections bound to user defined variables and replacing > transforms with source and sink of caches could resolve the pipeline to > execute properly under the interactive execution scenario. Also, a cached > PCollection can now be traced back to user code and can be used for user data > visualization if the user wants to do it. > E.g., > {code:python} > # ... > p = beam.Pipeline(interactive_runner.InteractiveRunner(), > options=pipeline_options) > messages = p | "Read" >> beam.io.ReadFromPubSub(subscription='...') > messages | "Write" >> beam.io.WriteToPubSub(topic_path) > result = p.run() > # ... > visualize(messages){code} > The interactive runner automatically figures out that PCollection > {code:python} > messages{code} > created by > {code:python} > p | "Read" >> beam.io.ReadFromPubSub(subscription='...'){code} > should be cached and reused if the notebook user appends more transforms. > And once the pipeline gets executed, the user could use any > visualize(PCollection) module to visualize the data statically (batch) or > dynamically (stream) -- This message was sent by Atlassian Jira (v8.3.4#803005)
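The "leaf" problem described in this issue can be shown with a toy graph model. This is only a sketch: the dictionary layout and the helper names `leaf_pcollections` and `user_bound_pcollections` are hypothetical and are not part of Interactive Beam. A transform is modeled as a pair of input and output PCollection names.

```python
def leaf_pcollections(transforms):
    """Return PCollections that no transform consumes as input.

    `transforms` maps a transform name to (inputs, outputs), each a list
    of PCollection names. Toy model only, not Beam's pipeline graph.
    """
    produced = set()
    consumed = set()
    for inputs, outputs in transforms.values():
        consumed.update(inputs)
        produced.update(outputs)
    return produced - consumed


def user_bound_pcollections(transforms, user_variables):
    """Return produced PCollections that are bound to a user variable."""
    produced = set()
    for _, outputs in transforms.values():
        produced.update(outputs)
    return produced & set(user_variables)


# "Read --> Write": the sink consumes `messages` and produces nothing.
read_write = {
    "Read": ([], ["messages"]),
    "Write": (["messages"], []),
}

print(leaf_pcollections(read_write))                      # set()
print(user_bound_pcollections(read_write, ["messages"]))  # {'messages'}
```

With leaf-based caching nothing is selected for the Read/Write pipeline, while selecting by user-bound variable still finds `messages`.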
[jira] [Work stopped] (BEAM-7861) Make it easy to change between multi-process and multi-thread mode for Python Direct runners
[ https://issues.apache.org/jira/browse/BEAM-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-7861 stopped by Hannah Jiang. -- > Make it easy to change between multi-process and multi-thread mode for Python > Direct runners > > > Key: BEAM-7861 > URL: https://issues.apache.org/jira/browse/BEAM-7861 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > > BEAM-3645 makes it possible to run a map task in parallel. > However, users need to change the runner when switching between > multithreading and multiprocessing mode. > We want to add a flag (e.g. --use-multiprocess) to make the switch easy > without changing the runner each time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
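A sketch of what a single switch between thread and process mode could look like, using only the Python standard library. The flag name `--use-multiprocess` is the example given in the issue text and is treated here as hypothetical; `--workers`, `executor_class`, and `run` are also illustration-only names and do not describe the Beam DirectRunner.

```python
import argparse
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Hypothetical flag taken from the issue text, not a real Beam option.
    parser.add_argument("--use-multiprocess", action="store_true")
    parser.add_argument("--workers", type=int, default=2)
    return parser.parse_args(argv)


def executor_class(use_multiprocess):
    """Pick the pool type from the flag; calling code stays the same."""
    return ProcessPoolExecutor if use_multiprocess else ThreadPoolExecutor


def square(x):
    # Module-level function so it can be pickled for a process pool.
    return x * x


def run(argv, values):
    args = parse_args(argv)
    pool_type = executor_class(args.use_multiprocess)
    with pool_type(max_workers=args.workers) as pool:
        return list(pool.map(square, values))


if __name__ == "__main__":
    print(run([], [1, 2, 3]))  # thread mode
```

Because both executors share the same interface, the caller changes only the flag, which mirrors the goal of not swapping the runner each time.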
[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.
[ https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=315980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315980 ] ASF GitHub Bot logged work on BEAM-7060: Author: ASF GitHub Bot Created on: 20/Sep/19 22:48 Start Date: 20/Sep/19 22:48 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #9627: [BEAM-7060] Avoid comparison with param.default in type signature analysis. URL: https://github.com/apache/beam/pull/9627 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315980) Time Spent: 18.5h (was: 18h 20m) > Design Py3-compatible typehints annotation support in Beam 3. > - > > Key: BEAM-7060 > URL: https://issues.apache.org/jira/browse/BEAM-7060 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Udi Meiri >Priority: Major > Fix For: 2.16.0 > > Time Spent: 18.5h > Remaining Estimate: 0h > > Existing [Typehints implementation in > Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/ > ] heavily relies on internal details of CPython implementation, and some of > the assumptions of this implementation broke as of Python 3.6, see for > example: https://issues.apache.org/jira/browse/BEAM-6877, which makes > typehints support unusable on Python 3.6 as of now. [Python 3 Kanban > Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245&view=detail] > lists several specific typehints-related breakages, prefixed with "TypeHints > Py3 Error". > We need to decide whether to: > - Deprecate in-house typehints implementation. > - Continue to support in-house implementation, which at this point is stale > code and has other known issues. 
> - Attempt to use some off-the-shelf libraries for supporting > type-annotations, like Pytype, Mypy, PyAnnotate. > With respect to this decision we also need to plan on immediate next steps to > unblock adoption of Beam for Python 3.6+ users. One potential option may be > to have Beam SDK ignore any typehint annotations on Py 3.6+. > cc: [~udim], [~altay], [~robertwb]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
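The pull request title above mentions avoiding comparison with `param.default`. The sketch below shows one plausible reason, as an assumption rather than a description of the actual patch: an equality test against a default value calls that value's `__eq__`, which can fail or return a non-boolean, while an identity test against `inspect.Parameter.empty` never touches the default. `NoisyEq` and `params_without_default` are hypothetical names.

```python
import inspect


class NoisyEq:
    """Stand-in for objects (e.g. arrays) whose == is not a plain bool."""

    def __eq__(self, other):
        raise TypeError("ambiguous comparison")


def params_without_default(func):
    """Names of parameters with no default, using identity, not equality."""
    signature = inspect.signature(func)
    return [
        name
        for name, param in signature.parameters.items()
        if param.default is inspect.Parameter.empty
    ]


def example(a, b=NoisyEq(), c=3):
    return a


print(params_without_default(example))  # ['a']
```

Writing the check as `param.default == inspect.Parameter.empty` would invoke `NoisyEq.__eq__` for parameter `b` and raise, which is the failure mode the identity check sidesteps.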
[jira] [Updated] (BEAM-8160) Add instructions about how to set FnApi multi-threads/processes
[ https://issues.apache.org/jira/browse/BEAM-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hannah Jiang updated BEAM-8160: --- Fix Version/s: Not applicable > Add instructions about how to set FnApi multi-threads/processes > --- > > Key: BEAM-8160 > URL: https://issues.apache.org/jira/browse/BEAM-8160 > Project: Beam > Issue Type: Task > Components: sdk-py-core >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: Not applicable > > Time Spent: 0.5h > Remaining Estimate: 0h > > Add instructions to Beam site or Beam wiki for easy discovery. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-7861) Make it easy to change between multi-process and multi-thread mode for Python Direct runners
[ https://issues.apache.org/jira/browse/BEAM-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-7861 started by Hannah Jiang. -- > Make it easy to change between multi-process and multi-thread mode for Python > Direct runners > > > Key: BEAM-7861 > URL: https://issues.apache.org/jira/browse/BEAM-7861 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > > BEAM-3645 makes it possible to run a map task in parallel. > However, users need to change the runner when switching between > multithreading and multiprocessing mode. > We want to add a flag (e.g. --use-multiprocess) to make the switch easy > without changing the runner each time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8160) Add instructions about how to set FnApi multi-threads/processes
[ https://issues.apache.org/jira/browse/BEAM-8160?focusedWorklogId=315960&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315960 ] ASF GitHub Bot logged work on BEAM-8160: Author: ASF GitHub Bot Created on: 20/Sep/19 22:13 Start Date: 20/Sep/19 22:13 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #9628: [BEAM-8160] Add FnApi execution mode instruction URL: https://github.com/apache/beam/pull/9628#issuecomment-533725331 R: @soyrice , is it ok to set you as a reviewer? Cc: @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315960) Time Spent: 0.5h (was: 20m) > Add instructions about how to set FnApi multi-threads/processes > --- > > Key: BEAM-8160 > URL: https://issues.apache.org/jira/browse/BEAM-8160 > Project: Beam > Issue Type: Task > Components: sdk-py-core >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Add instructions to Beam site or Beam wiki for easy discovery. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-7780) Make it easy to use multiple threads with DirectRunner
[ https://issues.apache.org/jira/browse/BEAM-7780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hannah Jiang closed BEAM-7780. -- Fix Version/s: Not applicable Resolution: Duplicate > Make it easy to use multiple threads with DirectRunner > -- > > Key: BEAM-7780 > URL: https://issues.apache.org/jira/browse/BEAM-7780 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Hannah Jiang >Priority: Minor > Fix For: Not applicable > > > This is already supported but not easy to use. It could be simplified by: > * Adding a flag > * OR adding simple-to-use documentation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8160) Add instructions about how to set FnApi multi-threads/processes
[ https://issues.apache.org/jira/browse/BEAM-8160?focusedWorklogId=315954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315954 ] ASF GitHub Bot logged work on BEAM-8160: Author: ASF GitHub Bot Created on: 20/Sep/19 22:04 Start Date: 20/Sep/19 22:04 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #9628: [BEAM-8160] Add FnApi execution mode instruction URL: https://github.com/apache/beam/pull/9628#issuecomment-533725331 R: @soyrice Is it ok to set you as a reviewer? Cc: @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315954) Time Spent: 20m (was: 10m) > Add instructions about how to set FnApi multi-threads/processes > --- > > Key: BEAM-8160 > URL: https://issues.apache.org/jira/browse/BEAM-8160 > Project: Beam > Issue Type: Task > Components: sdk-py-core >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Add instructions to Beam site or Beam wiki for easy discovery. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-8160) Add instructions about how to set FnApi multi-threads/processes
[ https://issues.apache.org/jira/browse/BEAM-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-8160 started by Hannah Jiang. -- > Add instructions about how to set FnApi multi-threads/processes > --- > > Key: BEAM-8160 > URL: https://issues.apache.org/jira/browse/BEAM-8160 > Project: Beam > Issue Type: Task > Components: sdk-py-core >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Add instructions to Beam site or Beam wiki for easy discovery. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-8160) Add instructions about how to set FnApi multi-threads/processes
[ https://issues.apache.org/jira/browse/BEAM-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hannah Jiang updated BEAM-8160: --- Status: Open (was: Triage Needed) > Add instructions about how to set FnApi multi-threads/processes > --- > > Key: BEAM-8160 > URL: https://issues.apache.org/jira/browse/BEAM-8160 > Project: Beam > Issue Type: Task > Components: sdk-py-core >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Add instructions to Beam site or Beam wiki for easy discovery. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8160) Add instructions about how to set FnApi multi-threads/processes
[ https://issues.apache.org/jira/browse/BEAM-8160?focusedWorklogId=315953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315953 ] ASF GitHub Bot logged work on BEAM-8160: Author: ASF GitHub Bot Created on: 20/Sep/19 22:03 Start Date: 20/Sep/19 22:03 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #9628: [BEAM-8160] Add FnApi execution mode instruction URL: https://github.com/apache/beam/pull/9628 Adding an instruction to share how to set multi-threading/multi-processing mode. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
[jira] [Resolved] (BEAM-7859) Portable Wordcount on Spark runner does not work in DOCKER execution mode.
[ https://issues.apache.org/jira/browse/BEAM-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver resolved BEAM-7859. --- Fix Version/s: 2.16.0 Resolution: Fixed > Portable Wordcount on Spark runner does not work in DOCKER execution mode. > -- > > Key: BEAM-7859 > URL: https://issues.apache.org/jira/browse/BEAM-7859 > Project: Beam > Issue Type: Bug > Components: runner-spark, sdk-py-harness >Reporter: Valentyn Tymofieiev >Assignee: Kyle Weaver >Priority: Major > Labels: portability-spark > Fix For: 2.16.0 > > Time Spent: 1h > Remaining Estimate: 0h > > The error was observed during Beam 2.14.0 release validation, see: > https://issues.apache.org/jira/browse/BEAM-7224?focusedCommentId=16896831&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16896831 > -Looks like master currently fails with a different error, both in Loopback > and Docker modes.- > [~ibzib] [~altay] [~robertwb] [~angoenka] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7600) Spark portable runner: reuse SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver resolved BEAM-7600. --- Fix Version/s: 2.16.0 Resolution: Fixed > Spark portable runner: reuse SDK harness > > > Key: BEAM-7600 > URL: https://issues.apache.org/jira/browse/BEAM-7600 > Project: Beam > Issue Type: Improvement > Components: runner-spark >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-spark > Fix For: 2.16.0 > > Time Spent: 8.5h > Remaining Estimate: 0h > > Right now, we're creating a new SDK harness every time an executable stage is > run [1], which is expensive. We should be able to re-use code from the Flink > runner to re-use the SDK harness [2]. > > [1] > [https://github.com/apache/beam/blob/c9fb261bc7666788402840bb6ce1b0ce2fd445d1/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkExecutableStageFunction.java#L135] > [2] > [https://github.com/apache/beam/blob/c9fb261bc7666788402840bb6ce1b0ce2fd445d1/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/FlinkDefaultExecutableStageContext.java] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8293) Document or log file system issues with docker
Kyle Weaver created BEAM-8293: - Summary: Document or log file system issues with docker Key: BEAM-8293 URL: https://issues.apache.org/jira/browse/BEAM-8293 Project: Beam Issue Type: Improvement Components: java-fn-execution Reporter: Kyle Weaver Assignee: Kyle Weaver A frequently asked question about portability on the mailing list is, "Why am I getting IOExceptions in my job?" The answer is often that the SDK harness runs in Docker, which does not have access to the local filesystem by default, so users get errors when they try to read/write local files via transforms or don't set an artifact_staging_location. We should at least document this on the website. Even better would be to log something, especially for artifact_staging_location, which is implicit and which users might not be aware of. -- This message was sent by Atlassian Jira (v8.3.4#803005)
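The logging idea in this ticket could look roughly like the following. This is only a sketch under stated assumptions: `staging_warnings` is a hypothetical helper, the options are modelled as a plain dictionary rather than Beam's real pipeline options, and treating Docker as the default `environment_type` is an assumption made for illustration. Only `artifact_staging_location` comes from the ticket itself.

```python
def staging_warnings(options):
    """Return warnings for option combinations that tend to cause IOExceptions.

    `options` is a plain dict standing in for pipeline options (hypothetical).
    """
    warnings = []
    # Assumption for this sketch: the harness runs in Docker unless told otherwise.
    env = options.get("environment_type", "DOCKER").upper()
    if env == "DOCKER" and not options.get("artifact_staging_location"):
        warnings.append(
            "artifact_staging_location is not set; the Docker SDK harness "
            "cannot see the local filesystem by default, so staging and "
            "local file reads/writes may fail with IOExceptions."
        )
    return warnings


if __name__ == "__main__":
    for message in staging_warnings({}):
        print("WARNING:", message)
```

A real implementation would presumably emit this through the runner's logger at job submission time rather than return a list.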
[jira] [Assigned] (BEAM-8277) Make docker build quicker
[ https://issues.apache.org/jira/browse/BEAM-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver reassigned BEAM-8277: - Assignee: Kyle Weaver > Make docker build quicker > - > > Key: BEAM-8277 > URL: https://issues.apache.org/jira/browse/BEAM-8277 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > > Building the Python SDK harness container takes minutes on my machine. > Possible lead: "We spend mins pulling cmd/beamctl deps." > [https://github.com/apache/beam/blob/47feeafb21023e2a60ae51737cc4000a2033719c/sdks/python/container/build.gradle#L38] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7907) Support customized container
[ https://issues.apache.org/jira/browse/BEAM-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hannah Jiang updated BEAM-7907: --- Summary: Support customized container (was: Support customized container for python) > Support customized container > > > Key: BEAM-7907 > URL: https://issues.apache.org/jira/browse/BEAM-7907 > Project: Beam > Issue Type: New Feature > Components: build-system, sdk-py-harness >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Labels: portability > Fix For: 2.16.0 > > > Support customized container. > Scope of this ticket is *python*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-7907) Support customized container
[ https://issues.apache.org/jira/browse/BEAM-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hannah Jiang updated BEAM-7907: --- Description: Support customized container. (was: Support customized container. Scope of this ticket is *python*.) > Support customized container > > > Key: BEAM-7907 > URL: https://issues.apache.org/jira/browse/BEAM-7907 > Project: Beam > Issue Type: New Feature > Components: build-system, sdk-py-harness >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Labels: portability > Fix For: 2.16.0 > > > Support customized container. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-8105) Add container publishing instruction to release manual
[ https://issues.apache.org/jira/browse/BEAM-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hannah Jiang resolved BEAM-8105. Resolution: Fixed > Add container publishing instruction to release manual > -- > > Key: BEAM-8105 > URL: https://issues.apache.org/jira/browse/BEAM-8105 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.16.0 > > Time Spent: 8h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-7881) Get rid of jackson to avoid the continuous flow of CVEs in Jackson
[ https://issues.apache.org/jira/browse/BEAM-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934747#comment-16934747 ] Romain Manni-Bucau commented on BEAM-7881: -- Bumping this: Jackson's lack of care about security is a real concern which should be addressed IMHO. Any hope of getting it fixed soon? > Get rid of jackson to avoid the continuous flow of CVEs in Jackson > -- > > Key: BEAM-7881 > URL: https://issues.apache.org/jira/browse/BEAM-7881 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Affects Versions: 2.14.0 >Reporter: Romain Manni-Bucau >Priority: Blocker > > Jackson keeps having CVEs on all releases of databind, and transitively the Beam > Java SDK core has CVEs on all its releases (for the record, at the time of writing this > issue you must use at least jackson-databind 2.9.9.2, but last week it was > 2.9.9.1, and 2.14 didn't get the fix). > It would be neat to get rid of Jackson, which has not fixed this issue for a very > long time now, and just use JSON-B or another JSON implementation to ensure the CVEs are > not exploitable because Beam is there. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315925&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315925 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 20:35 Start Date: 20/Sep/19 20:35 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315925) Time Spent: 2h 40m (was: 2.5h) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.
[ https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=315924&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315924 ] ASF GitHub Bot logged work on BEAM-7060: Author: ASF GitHub Bot Created on: 20/Sep/19 20:28 Start Date: 20/Sep/19 20:28 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9627: [BEAM-7060] Avoid comparison with param.default in type signature analysis. URL: https://github.com/apache/beam/pull/9627#issuecomment-533698262 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315924) Time Spent: 18h 20m (was: 18h 10m) > Design Py3-compatible typehints annotation support in Beam 3. > - > > Key: BEAM-7060 > URL: https://issues.apache.org/jira/browse/BEAM-7060 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Udi Meiri >Priority: Major > Fix For: 2.16.0 > > Time Spent: 18h 20m > Remaining Estimate: 0h > > The existing [Typehints implementation in > Beam|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/] > relies heavily on internal details of the CPython implementation, and some of > the assumptions of this implementation broke as of Python 3.6, see for > example: https://issues.apache.org/jira/browse/BEAM-6877, which makes > typehints support unusable on Python 3.6 as of now. [Python 3 Kanban > Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245&view=detail] > lists several specific typehints-related breakages, prefixed with "TypeHints > Py3 Error". > We need to decide whether to: > - Deprecate the in-house typehints implementation. > - Continue to support the in-house implementation, which at this point is stale > code and has other known issues. 
> - Attempt to use some off-the-shelf libraries for supporting > type-annotations, like Pytype, Mypy, PyAnnotate. > With respect to this decision, we also need to plan immediate next steps to unblock > adoption of Beam for Python 3.6+ users. One potential option may be to have > the Beam SDK ignore any typehint annotations on Py 3.6+. > cc: [~udim], [~altay], [~robertwb]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
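The title of PR #9627 mentions avoiding comparisons with `param.default`. The PR's code is not shown in this message, so the following is only a guess at the kind of problem involved, not the actual change: an equality check against `inspect.Parameter.empty` invokes the default value's own `__eq__`, which can misbehave for some objects, whereas an identity check does not. `NoisyDefault`, `sample`, and `params_with_defaults` are hypothetical names for this illustration.

```python
import inspect


class NoisyDefault:
    """Stand-in for a default value whose == cannot be used as a plain bool."""

    def __eq__(self, other):
        raise TypeError("ambiguous comparison")


def sample(a, b=NoisyDefault(), c=3):
    pass


def params_with_defaults(fn):
    """Names of parameters that declare a default, using identity checks only."""
    return [
        name
        for name, param in inspect.signature(fn).parameters.items()
        # `is not` never calls the default's __eq__, unlike `!=` or `==`.
        if param.default is not inspect.Parameter.empty
    ]


if __name__ == "__main__":
    print(params_with_defaults(sample))
```

Writing `param.default == inspect.Parameter.empty` instead would raise for `b` here, because the comparison is dispatched to `NoisyDefault.__eq__`.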
[jira] [Closed] (BEAM-7599) Python SDK: IntervalWindow cannot be cast to GlobalWindow on Cloud Dataflow Runner
[ https://issues.apache.org/jira/browse/BEAM-7599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Patoch closed BEAM-7599. - Fix Version/s: Not applicable Resolution: Abandoned > Python SDK: IntervalWindow cannot be cast to GlobalWindow on Cloud Dataflow > Runner > -- > > Key: BEAM-7599 > URL: https://issues.apache.org/jira/browse/BEAM-7599 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Affects Versions: 2.13.0 >Reporter: John Patoch >Priority: Minor > Fix For: Not applicable > > > Getting an error after deploying a pipeline built with the Python SDK on the > Cloud Dataflow Runner. > > -> The pipeline runs seamlessly on the local DirectRunner. > {code:java|title=Stackdriver Trace|borderStyle=solid} > java.lang.ClassCastException: > org.apache.beam.sdk.transforms.windowing.IntervalWindow cannot be cast to > org.apache.beam.sdk.transforms.windowing.GlobalWindow at > org.apache.beam.sdk.transforms.windowing.GlobalWindow$Coder.encode(GlobalWindow.java:59) > at org.apache.beam.sdk.coders.Coder.encode(Coder.java:136) at > org.apache.beam.sdk.util.CoderUtils.encodeToSafeStream(CoderUtils.java:82) at > org.apache.beam.sdk.util.CoderUtils.encodeToByteArray(CoderUtils.java:66) at > org.apache.beam.sdk.util.CoderUtils.encodeToByteArray(CoderUtils.java:51) at > org.apache.beam.sdk.util.CoderUtils.encodeToBase64(CoderUtils.java:151) at > org.apache.beam.runners.core.StateNamespaces$WindowNamespace.appendTo(StateNamespaces.java:117) > at > org.apache.beam.runners.dataflow.worker.WindmillStateInternals.encodeKey(WindmillStateInternals.java:256) > at > org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillValue.<init>(WindmillStateInternals.java:359) > at > org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillValue.<init>(WindmillStateInternals.java:337) > at > org.apache.beam.runners.dataflow.worker.WindmillStateInternals$CachingStateTable$1.bindValue(WindmillStateInternals.java:174) > at 
org.apache.beam.runners.core.StateTags$2.bindValue(StateTags.java:69) at > org.apache.beam.sdk.state.StateSpecs$ValueStateSpec.bind(StateSpecs.java:276) > at > org.apache.beam.sdk.state.StateSpecs$ValueStateSpec.bind(StateSpecs.java:266) > at > org.apache.beam.runners.core.StateTags$SimpleStateTag.bind(StateTags.java:296) > at org.apache.beam.runners.core.StateTable.get(StateTable.java:60) at > org.apache.beam.runners.dataflow.worker.WindmillStateInternals.state(WindmillStateInternals.java:334) > at > org.apache.beam.runners.core.ReduceFnContextFactory$StateAccessorImpl.access(ReduceFnContextFactory.java:207) > at > org.apache.beam.runners.core.triggers.TriggerStateMachineRunner.isClosed(TriggerStateMachineRunner.java:99) > at > org.apache.beam.runners.core.ReduceFnRunner.windowsThatAreOpen(ReduceFnRunner.java:275) > at > org.apache.beam.runners.core.ReduceFnRunner.processElements(ReduceFnRunner.java:345) > at > org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:94) > at > org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:42) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:115) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73) > at > org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80) > at > org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:134) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49) > at > 
org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:201) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159) > at > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77) > at > org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1283) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:147) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflow
[jira] [Work logged] (BEAM-8157) Key encoding for state requests is not consistent across SDKs
[ https://issues.apache.org/jira/browse/BEAM-8157?focusedWorklogId=315865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315865 ] ASF GitHub Bot logged work on BEAM-8157: Author: ASF GitHub Bot Created on: 20/Sep/19 18:18 Start Date: 20/Sep/19 18:18 Worklog Time Spent: 10m Work Description: mxm commented on issue #9484: [BEAM-8157] Ensure key encoding for state requests is consistent across SDKs URL: https://github.com/apache/beam/pull/9484#issuecomment-533659592 Any opinions here? I think this warrants some attention to finally fix the inconsistent key encoding across SDKs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315865) Time Spent: 5h (was: 4h 50m) > Key encoding for state requests is not consistent across SDKs > - > > Key: BEAM-8157 > URL: https://issues.apache.org/jira/browse/BEAM-8157 > Project: Beam > Issue Type: Bug > Components: runner-flink >Affects Versions: 2.13.0 >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.17.0 > > Time Spent: 5h > Remaining Estimate: 0h > > The Flink runner requires the internal key to be encoded without a length > prefix (OUTER context). The user state request handler exposes a serialized > version of the key to the Runner. This key is encoded with the NESTED context > which may add a length prefix. We need to convert it to OUTER context to > match the Flink runner's key encoding. > So far this has not caused the Flink Runner to behave incorrectly. However, > with the upcoming support for Flink 1.9, the state backend will not accept > requests for keys not part of any key group/partition of the operator. This > is very likely to happen with the encoding not being consistent. 
> **NOTE** This is only applicable to the Java SDK, as the Python SDK uses > OUTER encoding for the key in state requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
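To make the NESTED/OUTER distinction above concrete, here is a minimal sketch. It assumes the key is a raw byte string and that the nested form is a base-128 varint length prefix followed by the payload; all function names are hypothetical and none of this is Beam's coder API. For arbitrary key coders the general approach would be to decode in one context and re-encode in the other; stripping the prefix is only equivalent in the simple byte-string case assumed here.

```python
def encode_varint(n):
    """Encode a non-negative int as a base-128 varint (low 7 bits first)."""
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)
            return bytes(out)


def decode_varint(data):
    """Return (value, number_of_bytes_consumed)."""
    result = 0
    shift = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, i + 1
        shift += 7
    raise ValueError("truncated varint")


def encode_nested(payload):
    """Nested form in this sketch: length prefix, then the payload."""
    return encode_varint(len(payload)) + payload


def encode_outer(payload):
    """Outer form in this sketch: the payload alone, no prefix."""
    return payload


def nested_to_outer(encoded):
    """Strip the length prefix so the bytes match the outer encoding."""
    length, offset = decode_varint(encoded)
    body = encoded[offset:]
    if len(body) != length:
        raise ValueError("length prefix does not match payload size")
    return body


if __name__ == "__main__":
    key = b"user-42"
    print(encode_nested(key), nested_to_outer(encode_nested(key)))
```

Because the two encodings of the same key differ byte for byte, anything that partitions state by the encoded bytes (such as a key-group lookup) would see them as different keys, which is the inconsistency the ticket describes.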
[jira] [Commented] (BEAM-8212) StatefulParDoFn creates GC timers for every record
[ https://issues.apache.org/jira/browse/BEAM-8212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934636#comment-16934636 ] Maximilian Michels commented on BEAM-8212: -- Interesting, I've not thought about the cost of adding timers for the global window. It looks like we could skip them entirely. Perhaps this should be handled for Flink only, instead of in the StatefulDoFnRunner. Do you have any numbers on the potential slowdown? > StatefulParDoFn creates GC timers for every record > --- > > Key: BEAM-8212 > URL: https://issues.apache.org/jira/browse/BEAM-8212 > Project: Beam > Issue Type: Bug > Components: beam-community >Reporter: Akshay Iyangar >Assignee: Aizhamal Nurmamat kyzy >Priority: Major > > Hi > So currently the StatefulParDoFn creates timers for all the records. > [https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/StatefulDoFnRunner.java#L211] > This becomes a problem if you are using GlobalWindows for streaming, where > these timers get created and never get closed since the window will never > close. > This is a problem especially if you are memory-bound in RocksDB, where these > timers take up space and slow the pipelines considerably. > I was wondering whether, if the pipeline runs in global windows, we should avoid > adding timers to it at all? > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
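The suggestion to skip cleanup timers for the global window can be sketched as follows. This is a toy model, not the StatefulDoFnRunner code: `Window`, `GLOBAL`, and `set_gc_timer` are hypothetical names, and the timer store is a plain dictionary keyed by (key, window).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    end: float
    is_global: bool = False


# Hypothetical stand-in for the global window, which never closes.
GLOBAL = Window(end=float("inf"), is_global=True)


def set_gc_timer(timers, key, window, allowed_lateness=0.0):
    """Register a state-cleanup timer unless the window can never expire.

    Returns True if a timer was registered or refreshed.
    """
    if window.is_global:
        # The cleanup time would never be reached, so the timer only costs memory.
        return False
    timers[(key, window)] = window.end + allowed_lateness
    return True


if __name__ == "__main__":
    timers = {}
    for _ in range(1000):
        set_gc_timer(timers, "user-1", GLOBAL)
    print(len(timers))
```

Whether such a check belongs in the shared runner code or only in the Flink runner is exactly the open question raised in the comment above.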
[jira] [Updated] (BEAM-8292) Add a Reshuffle PTransform preventing fusion of the surrounding transforms
[ https://issues.apache.org/jira/browse/BEAM-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Patoch updated BEAM-8292: -- Status: Open (was: Triage Needed) > Add a Reshuffle PTransform preventing fusion of the surrounding transforms > -- > > Key: BEAM-8292 > URL: https://issues.apache.org/jira/browse/BEAM-8292 > Project: Beam > Issue Type: New Feature > Components: sdk-go >Reporter: John Patoch >Priority: Minor > > Reshuffle is a PTransform that takes a PCollection and shuffles the data > to help increase parallelism. > Reshuffle adds a temporary random key to each element, performs a > GroupByKey, and finally removes the temporary key. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8292) Add a Reshuffle PTransform preventing fusion of the surrounding transforms
John Patoch created BEAM-8292: - Summary: Add a Reshuffle PTransform preventing fusion of the surrounding transforms Key: BEAM-8292 URL: https://issues.apache.org/jira/browse/BEAM-8292 Project: Beam Issue Type: New Feature Components: sdk-go Reporter: John Patoch Reshuffle is a PTransform that takes a PCollection and shuffles the data to help increase parallelism. Reshuffle adds a temporary random key to each element, performs a GroupByKey, and finally removes the temporary key. -- This message was sent by Atlassian Jira (v8.3.4#803005)
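The three steps named in the description (add a temporary random key, group by key, drop the key) can be illustrated on an in-memory list. This is a plain-Python sketch, not Go SDK or Beam code; `reshuffle`, `num_buckets`, and `rng` are hypothetical names, and the fusion-breaking effect mentioned in the title comes from the runner's handling of the GroupByKey, which a local list cannot show.

```python
import random
from collections import defaultdict


def reshuffle(elements, num_buckets=4, rng=None):
    """Redistribute elements via a temporary random key, then discard the key."""
    rng = rng or random.Random()
    # Step 1: pair each element with a temporary random key.
    keyed = [(rng.randrange(num_buckets), element) for element in elements]
    # Step 2: group by the temporary key (the GroupByKey analogue).
    groups = defaultdict(list)
    for key, element in keyed:
        groups[key].append(element)
    # Step 3: drop the key and emit the values.
    return [element for key in groups for element in groups[key]]


if __name__ == "__main__":
    print(reshuffle(list(range(10)), rng=random.Random(0)))
```

The output contains the same elements as the input, generally in a different order.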
[jira] [Commented] (BEAM-8212) StatefulParDoFn creates GC timers for every record
[ https://issues.apache.org/jira/browse/BEAM-8212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16934618#comment-16934618 ] Akshay Iyangar commented on BEAM-8212: -- [~mxm] - Hey, can you help me with this? > StatefulParDoFn creates GC timers for every record > --- > > Key: BEAM-8212 > URL: https://issues.apache.org/jira/browse/BEAM-8212 > Project: Beam > Issue Type: Bug > Components: beam-community >Reporter: Akshay Iyangar >Assignee: Aizhamal Nurmamat kyzy >Priority: Major > > Hi > So currently the StatefulParDoFn creates timers for all the records. > [https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/StatefulDoFnRunner.java#L211] > This becomes a problem if you are using GlobalWindows for streaming, where > these timers get created and never get closed since the window will never > close. > This is a problem especially if you are memory-bound in RocksDB, where these > timers take up space and slow the pipelines considerably. > I was wondering whether, if the pipeline runs in global windows, we should avoid > adding timers to it at all? > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315804 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 17:05 Start Date: 20/Sep/19 17:05 Worklog Time Spent: 10m Work Description: soyrice commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326721404 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. + +## Using data to dynamically set session window gaps + +You can modify the [`assignWindows`](https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/transforms/windowing/SlidingWindows.html) function to use data-driven gaps instead of fixed windows. + +Access the `assignWindows` function through `WindowFn.AssignContext.element()`. 
The original, fixed-duration `assignWindows` function is: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow1 +%} +``` + +### Creating data-driven gaps +To use data-driven gaps, add the following snippets to the `assignWindows` function: +- A default value for when the custom gap is not present in the data +- A way to set the attribute from the main pipeline as a method of the custom windows + +For example, the following function assigns each element to a window between the timestamp and `gapDuration`: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow3 +%} +``` + +In this function, the `withDefaultGapDuration` and `withGapAttribute` methods are: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow4 +%} +``` + +Then, the new `gapAttribute` field and constructor dynamically create session windows with the calculated gap duration: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow2 +%} +``` + +### Windowing messages into sessions +After creating data-driven gaps, you can window Pub/Sub messages into the new, custom sessions: Review comment: Good point. Taking out Pub/Sub completely since it's not really necessary to understand the example This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315804) Time Spent: 2.5h (was: 2h 20m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
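The data-driven gap idea in the reviewed page can be sketched without Beam. The Java snippets referenced by the `github_sample` tags are not included in this message, so this is not that code: `assign_window`, `merge_sessions`, `gap_attribute`, and `default_gap` are hypothetical names that loosely mirror `withGapAttribute` and `withDefaultGapDuration`, elements are plain dictionaries, and timestamps are plain numbers in seconds.

```python
def assign_window(element, timestamp, gap_attribute="gap", default_gap=10):
    """Assign the element to the window [timestamp, timestamp + gap).

    The gap is read from the element when present, else the default is used.
    """
    gap = element.get(gap_attribute, default_gap)
    return (timestamp, timestamp + gap)


def merge_sessions(windows):
    """Merge overlapping windows into sessions; touching windows stay separate."""
    merged = []
    for start, end in sorted(windows):
        if merged and start < merged[-1][1]:
            # Overlaps the current session: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    events = [({"gap": 5}, 0), ({}, 3), ({}, 100)]
    windows = [assign_window(element, ts) for element, ts in events]
    print(merge_sessions(windows))
```

The first event carries its own gap of 5, while the other two fall back to the default of 10, so the first two windows overlap and merge into one session.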
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315803&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315803 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 17:05 Start Date: 20/Sep/19 17:05 Worklog Time Spent: 10m Work Description: soyrice commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326721215 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315803) Time Spent: 2h 20m (was: 2h 10m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315802&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315802 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 17:04 Start Date: 20/Sep/19 17:04 Worklog Time Spent: 10m Work Description: soyrice commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326721155 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. + +## Using data to dynamically set session window gaps + +You can modify the [`assignWindows`](https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/transforms/windowing/SlidingWindows.html) function to use data-driven gaps instead of fixed windows. + +Access the `assignWindows` function through `WindowFn.AssignContext.element()`. 
The original, fixed-duration `assignWindows` function is: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow1 +%} +``` + +### Creating data-driven gaps +To use data-driven gaps, add the following snippets to the `assignWindows` function: +- A default value for when the custom gap is not present in the data +- A way to set the attribute from the main pipeline as a method of the custom windows + +For example, the following function assigns each element to a window between the timestamp and `gapDuration`: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow3 +%} +``` + +In this function, the `withDefaultGapDuration` and `withGapAttribute` methods are: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow4 Review comment: Should be fixed now :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315802) Time Spent: 2h 10m (was: 2h) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315801&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315801 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 17:03 Start Date: 20/Sep/19 17:03 Worklog Time Spent: 10m Work Description: soyrice commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326720531 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. + +## Using data to dynamically set session window gaps + +You can modify the [`assignWindows`](https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/transforms/windowing/SlidingWindows.html) function to use data-driven gaps instead of fixed windows. + +Access the `assignWindows` function through `WindowFn.AssignContext.element()`. 
The original, fixed-duration `assignWindows` function is: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow1 +%} +``` + +### Creating data-driven gaps +To use data-driven gaps, add the following snippets to the `assignWindows` function: +- A default value for when the custom gap is not present in the data +- A way to set the attribute from the main pipeline as a method of the custom windows + +For example, the following function assigns each element to a window between the timestamp and `gapDuration`: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow3 +%} +``` + +In this function, the `withDefaultGapDuration` and `withGapAttribute` methods are: Review comment: Updated/clarified. It's just the first two lines. The region tag for this code snippet was broken but I fixed it in this PR This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315801) Time Spent: 2h (was: 1h 50m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315793 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 16:55 Start Date: 20/Sep/19 16:55 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326716385 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. Review comment: "in Python (with fnapi)" (otherwise supported in old python batch.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315793) Time Spent: 1h 40m (was: 1.5h) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315791&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315791 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 16:55 Start Date: 20/Sep/19 16:55 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326717581 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. + +## Using data to dynamically set session window gaps + +You can modify the [`assignWindows`](https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/transforms/windowing/SlidingWindows.html) function to use data-driven gaps instead of fixed windows. + +Access the `assignWindows` function through `WindowFn.AssignContext.element()`. 
The original, fixed-duration `assignWindows` function is: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow1 +%} +``` + +### Creating data-driven gaps +To use data-driven gaps, add the following snippets to the `assignWindows` function: +- A default value for when the custom gap is not present in the data +- A way to set the attribute from the main pipeline as a method of the custom windows + +For example, the following function assigns each element to a window between the timestamp and `gapDuration`: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow3 +%} +``` + +In this function, the `withDefaultGapDuration` and `withGapAttribute` methods are: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow4 +%} +``` + +Then, the new `gapAttribute` field and constructor dynamically create session windows with the calculated gap duration: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow2 +%} +``` + +### Windowing messages into sessions +After creating data-driven gaps, you can window Pub/Sub messages into the new, custom sessions: Review comment: How do we convert from pub/sub to tablerows? Should we mention that a conversion needs to happen. Or maybe not mention pubsub at all. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315791) Time Spent: 1.5h (was: 1h 20m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
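The data-driven gap idea in the quoted page (a default gap, plus a per-element attribute that overrides it) can be sketched in plain Java without any Beam dependency. This is only an illustration, not the code behind the `CustomSessionWindow` region tags: the class `DataDrivenGapSketch`, its nested `Window` type, the `Map`-based element, and the choice to read the gap as a number of seconds are all assumptions made for the example. Only the method names `withDefaultGapDuration` and `withGapAttribute` come from the quoted text.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Hypothetical, Beam-free model of data-driven session gaps.
final class DataDrivenGapSketch {

    // Hypothetical stand-in for a session window covering [start, end).
    static final class Window {
        final Instant start;
        final Instant end;

        Window(Instant start, Instant end) {
            this.start = start;
            this.end = end;
        }
    }

    private final Duration defaultGap;
    private final String gapAttribute; // may be null: always use the default

    private DataDrivenGapSketch(Duration defaultGap, String gapAttribute) {
        this.defaultGap = defaultGap;
        this.gapAttribute = gapAttribute;
    }

    // Gap used when the element does not carry its own gap value.
    static DataDrivenGapSketch withDefaultGapDuration(Duration defaultGap) {
        return new DataDrivenGapSketch(defaultGap, null);
    }

    // Name of the element field holding the gap (assumed here to be in seconds).
    DataDrivenGapSketch withGapAttribute(String gapAttribute) {
        return new DataDrivenGapSketch(defaultGap, gapAttribute);
    }

    // Assigns the element to a window from its timestamp to timestamp + gap.
    Window assignWindow(Instant timestamp, Map<String, Object> element) {
        Duration gap = defaultGap;
        Object raw = (gapAttribute == null) ? null : element.get(gapAttribute);
        if (raw instanceof Number) {
            gap = Duration.ofSeconds(((Number) raw).longValue());
        }
        return new Window(timestamp, timestamp.plus(gap));
    }

    public static void main(String[] args) {
        DataDrivenGapSketch fn =
            DataDrivenGapSketch.withDefaultGapDuration(Duration.ofSeconds(10))
                .withGapAttribute("gap");
        Instant t = Instant.ofEpochSecond(100);
        // Element with its own gap, then element that falls back to the default.
        System.out.println(fn.assignWindow(t, Map.of("gap", 5)).end);
        System.out.println(fn.assignWindow(t, Map.of("user", "a")).end);
    }
}
```

In a real pipeline the same decision would live inside a `WindowFn`'s `assignWindows` method and return Beam window objects; merging overlapping windows into sessions is a separate step that this sketch does not model.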
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315794&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315794 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 16:55 Start Date: 20/Sep/19 16:55 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326716143 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. + +## Using data to dynamically set session window gaps + +You can modify the [`assignWindows`](https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/transforms/windowing/SlidingWindows.html) function to use data-driven gaps instead of fixed windows. + +Access the `assignWindows` function through `WindowFn.AssignContext.element()`. 
The original, fixed-duration `assignWindows` function is: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow1 +%} +``` + +### Creating data-driven gaps +To use data-driven gaps, add the following snippets to the `assignWindows` function: +- A default value for when the custom gap is not present in the data +- A way to set the attribute from the main pipeline as a method of the custom windows + +For example, the following function assigns each element to a window between the timestamp and `gapDuration`: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow3 +%} +``` + +In this function, the `withDefaultGapDuration` and `withGapAttribute` methods are: Review comment: In which function? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315794) Time Spent: 1h 50m (was: 1h 40m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315792&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315792 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 16:55 Start Date: 20/Sep/19 16:55 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#discussion_r326716682 ## File path: website/src/documentation/patterns/custom-windows.md ## @@ -0,0 +1,112 @@ +--- +layout: section +title: "Custom window patterns" +section_menu: section-menu/documentation.html +permalink: /documentation/patterns/custom-windows/ +--- + + +# Custom window patterns +The samples on this page demonstrate common custom window patterns. You can create custom windows with [`WindowFn` functions]({{ site.baseurl }}/documentation/programming-guide/#provided-windowing-functions). For more information, see the [programming guide section on windowing]({{ site.baseurl }}/documentation/programming-guide/#windowing). + +**Note**: Custom merging windows isn't supported in Python. + +## Using data to dynamically set session window gaps + +You can modify the [`assignWindows`](https://beam.apache.org/releases/javadoc/current/index.html?org/apache/beam/sdk/transforms/windowing/SlidingWindows.html) function to use data-driven gaps instead of fixed windows. + +Access the `assignWindows` function through `WindowFn.AssignContext.element()`. 
The original, fixed-duration `assignWindows` function is: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow1 +%} +``` + +### Creating data-driven gaps +To use data-driven gaps, add the following snippets to the `assignWindows` function: +- A default value for when the custom gap is not present in the data +- A way to set the attribute from the main pipeline as a method of the custom windows + +For example, the following function assigns each element to a window between the timestamp and `gapDuration`: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow3 +%} +``` + +In this function, the `withDefaultGapDuration` and `withGapAttribute` methods are: + +```java +{% github_sample /apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/snippets/Snippets.java tag:CustomSessionWindow4 Review comment: I believe this snippets start a bit abruptly. in the staged version, then parenthesis does not look balanced. (this might be ok.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315792) Time Spent: 1.5h (was: 1h 20m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315789 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 16:50 Start Date: 20/Sep/19 16:50 Worklog Time Spent: 10m Work Description: musicnova commented on pull request #9626: [BEAM-8003] pyjobs init commit URL: https://github.com/apache/beam/pull/9626 **Please** add a meaningful description for your change here Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315776 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 16:27 Start Date: 20/Sep/19 16:27 Worklog Time Spent: 10m Work Description: soyrice commented on issue #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#issuecomment-533622765 > Could you share a link to the staged version? http://apache-beam-website-pull-requests.storage.googleapis.com/9406/documentation/patterns/overview/index.html This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315776) Time Spent: 1h 10m (was: 1h) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315777&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315777 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 16:27 Start Date: 20/Sep/19 16:27 Worklog Time Spent: 10m Work Description: soyrice commented on issue #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#issuecomment-533622861 > @soyrice looks like there are a bunch of validation errors. Do you know what they are? Looks like most were broken links. Should be ready to go now :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315777) Time Spent: 1h 20m (was: 1h 10m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8275) Beam SQL should support BigQuery in DIRECT_READ mode
[ https://issues.apache.org/jira/browse/BEAM-8275?focusedWorklogId=315782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315782 ] ASF GitHub Bot logged work on BEAM-8275: Author: ASF GitHub Bot Created on: 20/Sep/19 16:33 Start Date: 20/Sep/19 16:33 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9625: [BEAM-8275] Beam SQL should support BigQuery in DIRECT_READ mode URL: https://github.com/apache/beam/pull/9625 Add a table property 'method' to BigQuery. When no property specified, use Method.DEFAULT. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
[jira] [Work logged] (BEAM-8275) Beam SQL should support BigQuery in DIRECT_READ mode
[ https://issues.apache.org/jira/browse/BEAM-8275?focusedWorklogId=315784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315784 ] ASF GitHub Bot logged work on BEAM-8275: Author: ASF GitHub Bot Created on: 20/Sep/19 16:33 Start Date: 20/Sep/19 16:33 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9625: [BEAM-8275] Beam SQL should support BigQuery in DIRECT_READ mode URL: https://github.com/apache/beam/pull/9625#issuecomment-533624666 R: @apilloud This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315784) Time Spent: 40m (was: 0.5h) > Beam SQL should support BigQuery in DIRECT_READ mode > > > Key: BEAM-8275 > URL: https://issues.apache.org/jira/browse/BEAM-8275 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Andrew Pilloud >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > SQL currently only supports reading from BigQuery in DEFAULT (EXPORT) mode. > We also need to support DIRECT_READ mode. The method should be configurable > by TBLPROPERTIES through the SQL CLI. This will enable us to take advantage > of the DIRECT_READ features for filter and project push down. > References: > [https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api] > [https://beam.apache.org/blog/2019/06/04/adding-data-sources-to-sql.html] > [https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/bigquery/BigQueryTable.java] -- This message was sent by Atlassian Jira (v8.3.4#803005)
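The behaviour described for BEAM-8275 (a `method` table property, falling back to the default read method when the property is absent) can be sketched roughly as follows. This is a standalone illustration and not the code in the pull request: `MethodPropertySketch`, its nested `Method` enum, the `Map<String, String>` property bag, and the case-insensitive matching are all assumptions. The enum constants mirror the three modes named in the issue text (DEFAULT, EXPORT, DIRECT_READ).

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical resolver for a 'method' table property.
final class MethodPropertySketch {

    // Local stand-in for the read modes named in the issue.
    enum Method { DEFAULT, EXPORT, DIRECT_READ }

    // Returns DEFAULT when the property is missing or blank; otherwise parses it.
    // Throws IllegalArgumentException for an unrecognised value.
    static Method resolve(Map<String, String> tableProperties) {
        String raw = tableProperties.get("method");
        if (raw == null || raw.trim().isEmpty()) {
            return Method.DEFAULT;
        }
        return Method.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of()));                         // no property set
        System.out.println(resolve(Map.of("method", "DIRECT_READ"))); // explicit method
    }
}
```

Failing fast on an unknown value, rather than silently using the default, is a design choice of this sketch; the issue text does not say how invalid values should be handled.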
[jira] [Work logged] (BEAM-8045) Publish custom windows pattern
[ https://issues.apache.org/jira/browse/BEAM-8045?focusedWorklogId=315755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315755 ] ASF GitHub Bot logged work on BEAM-8045: Author: ASF GitHub Bot Created on: 20/Sep/19 16:09 Start Date: 20/Sep/19 16:09 Worklog Time Spent: 10m Work Description: aaltay commented on issue #9406: [BEAM-8045] Custom windows patterns URL: https://github.com/apache/beam/pull/9406#issuecomment-533616938 Could you share a link to the staged version? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315755) Time Spent: 1h (was: 50m) > Publish custom windows pattern > -- > > Key: BEAM-8045 > URL: https://issues.apache.org/jira/browse/BEAM-8045 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Cyrus Maden >Assignee: Cyrus Maden >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315736&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315736 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 15:15 Start Date: 20/Sep/19 15:15 Worklog Time Spent: 10m Work Description: lgajowy commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533594991 Thank you all! :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315736) Time Spent: 4h (was: 3h 50m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315732 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 15:08 Start Date: 20/Sep/19 15:08 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on pull request #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315732) Time Spent: 3h 50m (was: 3h 40m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315731 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 15:07 Start Date: 20/Sep/19 15:07 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533592411 LGTM Issue Time Tracking --- Worklog Id: (was: 315731) Time Spent: 3h 40m (was: 3.5h) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7920) AvroTableProvider
[ https://issues.apache.org/jira/browse/BEAM-7920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neville Li resolved BEAM-7920. -- Fix Version/s: 2.17.0 Resolution: Duplicate > AvroTableProvider > - > > Key: BEAM-7920 > URL: https://issues.apache.org/jira/browse/BEAM-7920 > Project: Beam > Issue Type: Improvement > Components: dsl-sql, io-java-avro >Affects Versions: 2.14.0 >Reporter: Neville Li >Assignee: Neville Li >Priority: Minor > Fix For: 2.17.0 > > > https://github.com/apache/beam/pull/6777 and BEAM-5807 mentioned > {{AvroTableProvider}} but I don't see one in the code base. Is this to be > implemented or am I missing something? > cc [~kanterov] > My WIP branch: https://github.com/spotify/beam/tree/neville/avro -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=315708&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315708 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 20/Sep/19 14:08 Start Date: 20/Sep/19 14:08 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#issuecomment-533569443 Run Python Load Tests ParDo Flink Batch Issue Time Tracking --- Worklog Id: (was: 315708) Time Spent: 4h (was: 3h 50m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8025) Cassandra IO classMethod test is flaky
[ https://issues.apache.org/jira/browse/BEAM-8025?focusedWorklogId=315697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315697 ] ASF GitHub Bot logged work on BEAM-8025: Author: ASF GitHub Bot Created on: 20/Sep/19 13:59 Start Date: 20/Sep/19 13:59 Worklog Time Spent: 10m Work Description: echauchot commented on issue #9614: [BEAM-8025] Avoid embedded cluster clean at startup to avoid race condition in cleaning temp cassandra directory with the TemporaryFolder ClassRule. URL: https://github.com/apache/beam/pull/9614#issuecomment-533565710 The exception is thrown from [here](https://github.com/datastax/java-driver/blob/46a4e7c73c75946073559681e1ad215fa919e00d/driver-core/src/main/java/com/datastax/driver/core/ControlConnection.java#L267), which is in the DataStax driver core module. It is thrown when DefaultConvictionPolicy.canReconnectNow() returns false for the host. I must admit I don't know why canReconnectNow() can be false on the first connection in some cases. Issue Time Tracking --- Worklog Id: (was: 315697) Time Spent: 1h 50m (was: 1h 40m) > Cassandra IO classMethod test is flaky > -- > > Key: BEAM-8025 > URL: https://issues.apache.org/jira/browse/BEAM-8025 > Project: Beam > Issue Type: Bug > Components: io-java-cassandra, test-failures >Affects Versions: 2.16.0 >Reporter: Kyle Weaver >Assignee: Etienne Chauchot >Priority: Blocker > Time Spent: 1h 50m > Remaining Estimate: 0h > > The most recent runs of this test are failing: > [https://builds.apache.org/job/beam_PreCommit_Java_Commit/7398/] > [https://builds.apache.org/job/beam_PreCommit_Java_Commit/7399/console] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=315690&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315690 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 20/Sep/19 13:54 Start Date: 20/Sep/19 13:54 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#issuecomment-533563717 Run Seed Job Issue Time Tracking --- Worklog Id: (was: 315690) Time Spent: 3h 50m (was: 3h 40m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=315686&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315686 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 20/Sep/19 13:53 Start Date: 20/Sep/19 13:53 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#discussion_r326637049 ## File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import CommonJobProperties as commonJobProperties +import CommonTestProperties +import LoadTestsBuilder as loadTestsBuilder +import PhraseTriggeringPostCommitBuilder +import Flink +import Docker + +String now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC')) + +def scenarios = { datasetName, sdkHarnessImageTag -> [ +[ +title: 'ParDo Python Load test: 2GB 100 byte records 10 times', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', Review comment: +1 for changing `itClass` to `test`. 
> Maybe we can provide a common type (class) for such places I like this idea. I've already tested such a class and I believe it'll make Jenkins job definitions more bullet-proof. I think I could introduce it in the next PR. Maybe I'll combine it with changes around the `LoadTestBuilder` class... Issue Time Tracking --- Worklog Id: (was: 315686) Time Spent: 3h 40m (was: 3.5h) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=315685&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315685 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 20/Sep/19 13:51 Start Date: 20/Sep/19 13:51 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#discussion_r326637823 ## File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +import CommonJobProperties as commonJobProperties +import CommonTestProperties +import LoadTestsBuilder as loadTestsBuilder +import PhraseTriggeringPostCommitBuilder +import Flink +import Docker + +String now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC')) + +def scenarios = { datasetName, sdkHarnessImageTag -> [ +[ +title: 'ParDo Python Load test: 2GB 100 byte records 10 times', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', +runner : CommonTestProperties.Runner.PORTABLE, +jobProperties: [ +job_name : 'load-tests-python-flink-batch-pardo-1-' + now, +project : 'apache-beam-testing', +publish_to_big_query : false, +metrics_dataset : datasetName, +metrics_table: 'python_flink_batch_pardo_1', +input_options: '\'{' + +'"num_records": 2000,' + +'"key_size": 10,' + +'"value_size": 90}\'', +iterations : 10, +number_of_counter_operations: 0, +number_of_counters : 0, +parallelism : 5, +job_endpoint : 'localhost:8099', +environment_config : sdkHarnessImageTag, +environment_type : 'DOCKER', +] +], +[ +title: 'ParDo Python Load test: 2GB 100 byte records 200 times', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', +runner : CommonTestProperties.Runner.PORTABLE, +jobProperties: [ +job_name : 'load-tests-python-flink-batch-pardo-2-' + now, +project : 'apache-beam-testing', +publish_to_big_query : false, +metrics_dataset : datasetName, +metrics_table: 'python_flink_batch_pardo_2', +input_options: '\'{' + +'"num_records": 2000,' + +'"key_size": 10,' + +'"value_size": 90}\'', +iterations : 200, +number_of_counter_operations: 0, +number_of_counters : 0, +parallelism : 5, +job_endpoint : 'localhost:8099', +environment_config : sdkHarnessImageTag, +environment_type : 'DOCKER', +] +], +[ +title: 'ParDo Python Load test: 2GB 100 byte records 10 counters', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', +runner : CommonTestProperties.Runner.PORTABLE, +jobProperties: [ +job_name : 
'load-tests-python-flink-batch-pardo-3-' + now, +project : 'apache-beam-testing', +
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=315679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315679 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 20/Sep/19 13:49 Start Date: 20/Sep/19 13:49 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#discussion_r326637049 ## File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import CommonJobProperties as commonJobProperties +import CommonTestProperties +import LoadTestsBuilder as loadTestsBuilder +import PhraseTriggeringPostCommitBuilder +import Flink +import Docker + +String now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC')) + +def scenarios = { datasetName, sdkHarnessImageTag -> [ +[ +title: 'ParDo Python Load test: 2GB 100 byte records 10 times', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', Review comment: +1 for changing `itClass` to `test`. 
> Maybe we can provide a common type (class) for such places I like this idea. I've already tested such a class, but I think I could introduce it in the next PR. Maybe I'll combine it with changes around the `LoadTestBuilder` class. Issue Time Tracking --- Worklog Id: (was: 315679) Time Spent: 3h 20m (was: 3h 10m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Lukavský closed BEAM-8113. -- Fix Version/s: Not applicable Resolution: Won't Fix Needs a different approach. > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Fix For: Not applicable > > Time Spent: 8.5h > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
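For readers unfamiliar with the distinction in the BEAM-8113 description, the following is a rough sketch of what "stage files from both classloaders" could look like. It is not the code from PR #9451 (which was closed unmerged), and the class name, the helper names `urlsOf`, `filesToStage`, and `loaderFor`, and the jar paths are all hypothetical. It assumes both loaders are `URLClassLoader` instances; on Java 9 and later the default application loader is not, so a real implementation would need another way to discover the class path.

```java
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ContextClassLoaderStagingSketch {

  /** Lists the URLs a loader exposes; loaders that are not URLClassLoaders yield nothing. */
  static List<String> urlsOf(ClassLoader loader) {
    List<String> urls = new ArrayList<>();
    if (loader instanceof URLClassLoader) {
      for (URL url : ((URLClassLoader) loader).getURLs()) {
        urls.add(url.toString());
      }
    }
    return urls;
  }

  /** Merges the entries of both loaders, keeping first-seen order and dropping duplicates. */
  static Set<String> filesToStage(ClassLoader runnerLoader, ClassLoader contextLoader) {
    Set<String> files = new LinkedHashSet<>(urlsOf(runnerLoader));
    files.addAll(urlsOf(contextLoader));
    return files;
  }

  /** Builds an isolated loader over hypothetical jar locations (the files need not exist). */
  static URLClassLoader loaderFor(String... locations) throws Exception {
    URL[] urls = new URL[locations.length];
    for (int i = 0; i < locations.length; i++) {
      urls[i] = URI.create(locations[i]).toURL();
    }
    return new URLClassLoader(urls, null);
  }

  /** Simulates a runner loader and a context loader that share one jar. */
  static List<String> demo() {
    try (URLClassLoader runnerLoader = loaderFor("file:/tmp/runner.jar", "file:/tmp/shared.jar");
        URLClassLoader contextLoader =
            loaderFor("file:/tmp/shared.jar", "file:/tmp/user-code.jar")) {
      return new ArrayList<>(filesToStage(runnerLoader, contextLoader));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // prints [file:/tmp/runner.jar, file:/tmp/shared.jar, file:/tmp/user-code.jar]
    System.out.println(demo());
  }
}
```

In a real job the two arguments would be something like `FlinkRunner.class.getClassLoader()` and `Thread.currentThread().getContextClassLoader()`, as named in the issue. The issue was resolved as Won't Fix, so this only illustrates the original proposal, not a recommended approach.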
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315675 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 13:45 Start Date: 20/Sep/19 13:45 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533475622 Run JavaBeamZetaSQL PreCommit Issue Time Tracking --- Worklog Id: (was: 315675) Time Spent: 3h 20m (was: 3h 10m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315676 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 13:45 Start Date: 20/Sep/19 13:45 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533475741 Run Portable_Python PreCommit Issue Time Tracking --- Worklog Id: (was: 315676) Time Spent: 3.5h (was: 3h 20m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315673&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315673 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 13:45 Start Date: 20/Sep/19 13:45 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533560061 Run Portable_Python PreCommit Issue Time Tracking --- Worklog Id: (was: 315673) Time Spent: 3h (was: 2h 50m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315674&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315674 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 13:45 Start Date: 20/Sep/19 13:45 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533475551 Run Python PreCommit Issue Time Tracking --- Worklog Id: (was: 315674) Time Spent: 3h 10m (was: 3h) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=315672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315672 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 20/Sep/19 13:44 Start Date: 20/Sep/19 13:44 Worklog Time Spent: 10m Work Description: je-ik commented on pull request #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451 Issue Time Tracking --- Worklog Id: (was: 315672) Time Spent: 8.5h (was: 8h 20m) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=315671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315671 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 20/Sep/19 13:44 Start Date: 20/Sep/19 13:44 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451#issuecomment-533559750 I'm closing this because it turns out that, even if merged as proposed, it would not solve what I was hoping for. I will propose a different approach. Issue Time Tracking --- Worklog Id: (was: 315671) Time Spent: 8h 20m (was: 8h 10m) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 8h 20m > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=315669&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315669 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 20/Sep/19 13:42 Start Date: 20/Sep/19 13:42 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#discussion_r326633757 ## File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +import CommonJobProperties as commonJobProperties +import CommonTestProperties +import LoadTestsBuilder as loadTestsBuilder +import PhraseTriggeringPostCommitBuilder +import Flink +import Docker + +String now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC')) + +def scenarios = { datasetName, sdkHarnessImageTag -> [ +[ +title: 'ParDo Python Load test: 2GB 100 byte records 10 times', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', +runner : CommonTestProperties.Runner.PORTABLE, +jobProperties: [ +job_name : 'load-tests-python-flink-batch-pardo-1-' + now, +project : 'apache-beam-testing', +publish_to_big_query : false, +metrics_dataset : datasetName, +metrics_table: 'python_flink_batch_pardo_1', +input_options: '\'{' + +'"num_records": 2000,' + +'"key_size": 10,' + +'"value_size": 90}\'', +iterations : 10, +number_of_counter_operations: 0, +number_of_counters : 0, +parallelism : 5, +job_endpoint : 'localhost:8099', +environment_config : sdkHarnessImageTag, +environment_type : 'DOCKER', +] +], +[ +title: 'ParDo Python Load test: 2GB 100 byte records 200 times', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', +runner : CommonTestProperties.Runner.PORTABLE, +jobProperties: [ +job_name : 'load-tests-python-flink-batch-pardo-2-' + now, Review comment: For me, appending string suffixes via the '+' operator is a tiny bit cleaner. I don't think we need to replace this with the ${} syntax. 
Issue Time Tracking --- Worklog Id: (was: 315669) Time Spent: 3h (was: 2h 50m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=315670&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315670 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 20/Sep/19 13:42 Start Date: 20/Sep/19 13:42 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#discussion_r326633823 ## File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +import CommonJobProperties as commonJobProperties +import CommonTestProperties +import LoadTestsBuilder as loadTestsBuilder +import PhraseTriggeringPostCommitBuilder +import Flink +import Docker + +String now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC')) + +def scenarios = { datasetName, sdkHarnessImageTag -> [ +[ +title: 'ParDo Python Load test: 2GB 100 byte records 10 times', +itClass : 'apache_beam.testing.load_tests.pardo_test:ParDoTest.testParDo', +runner : CommonTestProperties.Runner.PORTABLE, +jobProperties: [ Review comment: +1 Issue Time Tracking --- Worklog Id: (was: 315670) Time Spent: 3h 10m (was: 3h) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7520) DirectRunner timers are not strictly time ordered
[ https://issues.apache.org/jira/browse/BEAM-7520?focusedWorklogId=315668&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315668 ] ASF GitHub Bot logged work on BEAM-7520: Author: ASF GitHub Bot Created on: 20/Sep/19 13:41 Start Date: 20/Sep/19 13:41 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9190: [BEAM-7520] Fix timer firing order in DirectRunner URL: https://github.com/apache/beam/pull/9190#issuecomment-533558587 @robertwb Would you have time to look into this please? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315668) Time Spent: 9.5h (was: 9h 20m) > DirectRunner timers are not strictly time ordered > - > > Key: BEAM-7520 > URL: https://issues.apache.org/jira/browse/BEAM-7520 > Project: Beam > Issue Type: Bug > Components: runner-direct >Affects Versions: 2.13.0 >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 9.5h > Remaining Estimate: 0h > > Let's suppose we have the following situation: > - statful ParDo with two timers - timerA and timerB > - timerA is set for window.maxTimestamp() + 1 > - timerB is set anywhere between timerB.timestamp > - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE > Then the order of timers is as follows (correct): > - timerB > - timerA > But, if timerB sets another timer (say for timerB.timestamp + 1), then the > order of timers will be: > - timerB (timerB.timestamp) > - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE) > - timerB (timerB.timestamp + 1) > Which is not ordered by timestamp. The reason for this is that when the input > watermark update is evaluated, the WatermarkManager,extractFiredTimers() will > produce both timerA and timerB. 
That would be correct, but when timerB sets > another timer, that breaks this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
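The ordering problem described in BEAM-7520 can be reproduced with a toy model. The sketch below is not Beam code and does not claim to show how the DirectRunner or PR #9190 works: the function names, the integer timestamps (5 for timerB, 100 for timerA, 1000 for the watermark), and the callback are all hypothetical. `fire_batch` extracts every eligible timer in one pass before firing any of them, which mirrors the behaviour the issue describes. `fire_ordered` is one possible way to keep timestamp order, by re-checking a priority queue after each firing.

```python
import heapq


def fire_batch(timers, watermark, on_fire):
    """Extract all eligible timers at once, then fire them.

    Timers set while firing are only seen in the next extraction round,
    so they can fire after a timer with a later timestamp.
    """
    order = []
    pending = list(timers)
    while True:
        eligible = sorted(t for t in pending if t[0] <= watermark)
        if not eligible:
            break
        pending = [t for t in pending if t[0] > watermark]
        for ts, name in eligible:
            order.append((ts, name))
            pending.extend(on_fire(ts, name))
    return order


def fire_ordered(timers, watermark, on_fire):
    """Fire one timer at a time from a min-heap keyed by timestamp.

    A timer set during firing is pushed back onto the heap, so it is
    considered before any already-queued timer with a later timestamp.
    """
    heap = list(timers)
    heapq.heapify(heap)
    order = []
    while heap and heap[0][0] <= watermark:
        ts, name = heapq.heappop(heap)
        order.append((ts, name))
        for new_timer in on_fire(ts, name):
            heapq.heappush(heap, new_timer)
    return order


def reset_timer_b(ts, name):
    """Hypothetical callback: the first timerB firing sets timerB again at ts + 1."""
    if name == "timerB" and ts == 5:
        return [(ts + 1, "timerB")]
    return []


initial = [(100, "timerA"), (5, "timerB")]
batch_order = fire_batch(initial, 1000, reset_timer_b)
heap_order = fire_ordered(initial, 1000, reset_timer_b)
print(batch_order)
print(heap_order)
```

In this model the batch version fires timerA before the second timerB, matching the out-of-order sequence listed in the issue, while the heap version fires both timerB instances first.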
[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API
[ https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=315645&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315645 ] ASF GitHub Bot logged work on BEAM-6855: Author: ASF GitHub Bot Created on: 20/Sep/19 12:57 Start Date: 20/Sep/19 12:57 Worklog Time Spent: 10m Work Description: salmanVD commented on issue #9612: [BEAM-6855] Side inputs are not supported when using the state API URL: https://github.com/apache/beam/pull/9612#issuecomment-533543490 R: @reuvenlax This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315645) Time Spent: 5h 10m (was: 5h) > Side inputs are not supported when using the state API > -- > > Key: BEAM-6855 > URL: https://issues.apache.org/jira/browse/BEAM-6855 > Project: Beam > Issue Type: Bug > Components: runner-core, runner-dataflow, runner-direct >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 5h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-8078) streaming_wordcount_debugging.py is missing a test
[ https://issues.apache.org/jira/browse/BEAM-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Vysotin reassigned BEAM-8078: - Assignee: Aleksey Vysotin > streaming_wordcount_debugging.py is missing a test > -- > > Key: BEAM-8078 > URL: https://issues.apache.org/jira/browse/BEAM-8078 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Aleksey Vysotin >Priority: Minor > Labels: beginner, easy, newbie, starter > > It's example code and should have a basic_test (like the other wordcount > variants in [1]) to at least verify that it runs in the latest Beam release. > [1] > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-8078) streaming_wordcount_debugging.py is missing a test
[ https://issues.apache.org/jira/browse/BEAM-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-8078 started by Aleksey Vysotin. - > streaming_wordcount_debugging.py is missing a test > -- > > Key: BEAM-8078 > URL: https://issues.apache.org/jira/browse/BEAM-8078 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Aleksey Vysotin >Priority: Minor > Labels: beginner, easy, newbie, starter > > It's example code and should have a basic_test (like the other wordcount > variants in [1]) to at least verify that it runs in the latest Beam release. > [1] > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.
[ https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=315561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315561 ] ASF GitHub Bot logged work on BEAM-7223: Author: ASF GitHub Bot Created on: 20/Sep/19 10:04 Start Date: 20/Sep/19 10:04 Worklog Time Spent: 10m Work Description: stale[bot] commented on issue #8877: [BEAM-7223] Add ValidatesRunner for Flink for python 3.5 - 3.6 - 3.7 URL: https://github.com/apache/beam/pull/8877#issuecomment-533492927 This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315561) Time Spent: 2h 50m (was: 2h 40m) > Add ValidatesRunner test suite for Flink on Python 3. > - > > Key: BEAM-7223 > URL: https://issues.apache.org/jira/browse/BEAM-7223 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Ankur Goenka >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.17.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Add py3 integration tests for Flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.
[ https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=315562&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315562 ] ASF GitHub Bot logged work on BEAM-7223: Author: ASF GitHub Bot Created on: 20/Sep/19 10:04 Start Date: 20/Sep/19 10:04 Worklog Time Spent: 10m Work Description: stale[bot] commented on pull request #8877: [BEAM-7223] Add ValidatesRunner for Flink for python 3.5 - 3.6 - 3.7 URL: https://github.com/apache/beam/pull/8877 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315562) Time Spent: 3h (was: 2h 50m) > Add ValidatesRunner test suite for Flink on Python 3. > - > > Key: BEAM-7223 > URL: https://issues.apache.org/jira/browse/BEAM-7223 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Ankur Goenka >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.17.0 > > Time Spent: 3h > Remaining Estimate: 0h > > Add py3 integration tests for Flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315539 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 09:12 Start Date: 20/Sep/19 09:12 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533475622 Run JavaBeamZetaSQL PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315539) Time Spent: 2h 40m (was: 2.5h) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315538 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 09:12 Start Date: 20/Sep/19 09:12 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533475551 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315538) Time Spent: 2.5h (was: 2h 20m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315541 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 09:12 Start Date: 20/Sep/19 09:12 Worklog Time Spent: 10m Work Description: aromanenko-dev commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533475741 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315541) Time Spent: 2h 50m (was: 2h 40m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=315529&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315529 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 20/Sep/19 08:37 Start Date: 20/Sep/19 08:37 Worklog Time Spent: 10m Work Description: lgajowy commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-533463891 @tvalentyn yes, thank you for asking. Python tests are the last ones that use PKB. There is a common ticket for removing Perfkit on Jira: [BEAM-7772](https://issues.apache.org/jira/browse/BEAM-7772) Removing PKB from Python suites is tracked here (subtask): [BEAM-7774](https://issues.apache.org/jira/browse/BEAM-7774) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 315529) Time Spent: 2h 20m (was: 2h 10m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)