[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API
[ https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=321060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321060 ] ASF GitHub Bot logged work on BEAM-6855: Author: ASF GitHub Bot Created on: 01/Oct/19 05:50 Start Date: 01/Oct/19 05:50 Worklog Time Spent: 10m Work Description: salmanVD commented on issue #9612: [BEAM-6855] Side inputs are not supported when using the state API URL: https://github.com/apache/beam/pull/9612#issuecomment-536877287 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 321060) Time Spent: 6h 20m (was: 6h 10m) > Side inputs are not supported when using the state API > -- > > Key: BEAM-6855 > URL: https://issues.apache.org/jira/browse/BEAM-6855 > Project: Beam > Issue Type: Bug > Components: runner-core, runner-dataflow, runner-direct >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 6h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API
[ https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=321059&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321059 ] ASF GitHub Bot logged work on BEAM-6855: Author: ASF GitHub Bot Created on: 01/Oct/19 05:23 Start Date: 01/Oct/19 05:23 Worklog Time Spent: 10m Work Description: salmanVD commented on issue #9612: [BEAM-6855] Side inputs are not supported when using the state API URL: https://github.com/apache/beam/pull/9612#issuecomment-536870195 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 321059) Time Spent: 6h 10m (was: 6h) > Side inputs are not supported when using the state API > -- > > Key: BEAM-6855 > URL: https://issues.apache.org/jira/browse/BEAM-6855 > Project: Beam > Issue Type: Bug > Components: runner-core, runner-dataflow, runner-direct >Reporter: Reuven Lax >Assignee: Shehzaad Nakhoda >Priority: Major > Time Spent: 6h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.
[ https://issues.apache.org/jira/browse/BEAM-5878?focusedWorklogId=321025&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321025 ] ASF GitHub Bot logged work on BEAM-5878: Author: ASF GitHub Bot Created on: 01/Oct/19 02:04 Start Date: 01/Oct/19 02:04 Worklog Time Spent: 10m Work Description: lazylynx commented on pull request #9686: [WIP][BEAM-5878] update dill min version to 0.3.1.1 and add test for functions with Keyword-only arguments URL: https://github.com/apache/beam/pull/9686 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 321025) Time Spent: 15h (was: 14h 50m) > Support DoFns with Keyword-only arguments in Python 3. > -- > > Key: BEAM-5878 > URL: https://issues.apache.org/jira/browse/BEAM-5878 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: yoshiki obata >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 15h > Remaining Estimate: 0h > > Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to > define functions with keyword-only arguments. > Currently Beam does not handle them correctly. [~ruoyu] pointed out [one > place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118] > in our codebase that we should fix: in Python in 3.0 inspect.getargspec() > will fail on functions with keyword-only arguments, but a new method > [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec] > supports them. > There may be implications for our (best-effort) type-hints machinery. > We should also add a Py3-only unit tests that covers DoFn's with keyword-only > arguments once Beam Python 3 tests are in a good shape. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.
[ https://issues.apache.org/jira/browse/BEAM-5878?focusedWorklogId=321024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321024 ] ASF GitHub Bot logged work on BEAM-5878: Author: ASF GitHub Bot Created on: 01/Oct/19 02:04 Start Date: 01/Oct/19 02:04 Worklog Time Spent: 10m Work Description: lazylynx commented on issue #9686: [WIP][BEAM-5878] update dill min version to 0.3.1.1 and add test for functions with Keyword-only arguments URL: https://github.com/apache/beam/pull/9686#issuecomment-536826908 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 321024) Time Spent: 14h 50m (was: 14h 40m) > Support DoFns with Keyword-only arguments in Python 3. > -- > > Key: BEAM-5878 > URL: https://issues.apache.org/jira/browse/BEAM-5878 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: yoshiki obata >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 14h 50m > Remaining Estimate: 0h > > Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to > define functions with keyword-only arguments. > Currently Beam does not handle them correctly. [~ruoyu] pointed out [one > place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118] > in our codebase that we should fix: in Python in 3.0 inspect.getargspec() > will fail on functions with keyword-only arguments, but a new method > [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec] > supports them. > There may be implications for our (best-effort) type-hints machinery. > We should also add a Py3-only unit tests that covers DoFn's with keyword-only > arguments once Beam Python 3 tests are in a good shape. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)
[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320984&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320984 ] ASF GitHub Bot logged work on BEAM-7389: Author: ASF GitHub Bot Created on: 01/Oct/19 00:06 Start Date: 01/Oct/19 00:06 Worklog Time Spent: 10m Work Description: davidcavazos commented on issue #9669: [BEAM-7389] Update include buttons to support multiple languages URL: https://github.com/apache/beam/pull/9669#issuecomment-536801256 I'm actually regularizing some names in #9664 alongside #9692. Basically we would need to rename everything to `elementwise` since `element-wise` is not permitted in Python, that's why it ended up being `element_wise`. In java we'll also have a similar problem unless we go for `elementwise`. That would imply moving/renaming all the website source files as well as Python source files to read the new directory names. That wouldn't cause any problems if done in two stages as the PRs I mentioned above. It would just be two large PRs even if they don't change any of the content. I like the idea since I would like to keep things consistent. Just to make things clear, here is what would have to be done: 1. PR1: Create new `elementwise` directory. * **Copy** over the `element_wise` directory into `elementwise`. * Make sure all tests pass * We would have (temporarily) two copies of the code snippets. 1. PR2: Update the docs. * **Move** the docs from `element-wise` to `elementwise`. * **Update** all code snippets to point to the new `elementwise` Python directory. * Regenerate notebooks * **Delete** old `element_wise` Python directory. These would be big PRs even if there is no actual content change. @aaltay How about we do these changes in the open PRs we already have for this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320984) Time Spent: 61h 50m (was: 61h 40m) > Colab examples for element-wise transforms (Python) > --- > > Key: BEAM-7389 > URL: https://issues.apache.org/jira/browse/BEAM-7389 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Rose Nguyen >Assignee: David Cavazos >Priority: Minor > Time Spent: 61h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)
[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320981 ] ASF GitHub Bot logged work on BEAM-7389: Author: ASF GitHub Bot Created on: 01/Oct/19 00:04 Start Date: 01/Oct/19 00:04 Worklog Time Spent: 10m Work Description: davidcavazos commented on issue #9669: [BEAM-7389] Update include buttons to support multiple languages URL: https://github.com/apache/beam/pull/9669#issuecomment-536801256 I'm actually regularizing some names in #9664 alongside #9692. Basically we would need to rename everything to `elementwise` since `element-wise` is not permitted in Python, that's why it ended up being `element_wise`. In java we'll also have a similar problem unless we go for `elementwise`. That would imply moving/renaming all the website source files as well as Python source files to read the new directory names. That wouldn't cause any problems if done in two stages as the PRs I mentioned above. It would just be two large PRs even if they don't change any of the content. I like the idea since I would like to keep things consistent. Just to make things clear, here is what would have to be done: 1. PR1: Create new `elementwise` directory. * **Copy** over the `element_wise` directory into `elementwise`. * Make sure all tests pass * We would have (temporarily) two copies of the code snippets. 1. PR2: Update the docs. * **Move** the docs from `element-wise` to `elementwise`. * Update all code snippets to point to the new `elementwise` Python directory. * Regenerate notebooks * Delete old `element_wise` Python directory. These would be big PRs even if there is no actual content change. @aaltay How about we do these changes in the open PRs we already have for this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320981) Time Spent: 61h 40m (was: 61.5h) > Colab examples for element-wise transforms (Python) > --- > > Key: BEAM-7389 > URL: https://issues.apache.org/jira/browse/BEAM-7389 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Rose Nguyen >Assignee: David Cavazos >Priority: Minor > Time Spent: 61h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941410#comment-16941410 ] sridhar Reddy commented on BEAM-7049: - [~amaliujia] I just want to make sure that we are on the same page as far as the next steps are concerned. At this point, I am waiting for your instructions on how to proceed further: How to split UNION ALL and UNION and what aspects of SQL Optimizer should be I looking at. Please take your time to update the Jira. I just want to make sure we are not waiting for each other. > Merge multiple input to one BeamUnionRel > > > Key: BEAM-7049 > URL: https://issues.apache.org/jira/browse/BEAM-7049 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: sridhar Reddy >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > BeamUnionRel assumes inputs are two and rejects more. So `a UNION b UNION c` > will have to be created as UNION(a, UNION(b, c)) and have two shuffles. If > BeamUnionRel can handle multiple shuffles, we will have only one shuffle -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8303) Filesystems not properly registered using FileIO.write()
[ https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941407#comment-16941407 ] Mark Liu commented on BEAM-8303: Looks like this issue happened in 2.15.0, thus it's not a regression for 2.16.0. According to the policy in https://beam.apache.org/contribute/release-guide/#4-triage-release-blocking-issues-in-jira, it's not necessary a blocker for 2.16.0. I'll move this issue to 2.17.0 but if it's possible, we can still back port it to 2.16.0. > Filesystems not properly registered using FileIO.write() > > > Key: BEAM-8303 > URL: https://issues.apache.org/jira/browse/BEAM-8303 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.15.0 >Reporter: Preston Koprivica >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.16.0 > > Time Spent: 50m > Remaining Estimate: 0h > > I’m getting the following error when attempting to use the FileIO apis > (beam-2.15.0) and integrating with AWS S3. I have setup the PipelineOptions > with all the relevant AWS options, so the filesystem registry **should** be > properly seeded by the time the graph is compiled and executed: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480) > at > org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93) > at > org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55) > at > org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106) > at > org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72) > at > org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47) > at > org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73) > at > org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503) > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > {code} > For reference, the write code resembles this: > {code:java} > FileIO.Write write = FileIO.write() > .via(ParquetIO.sink(schema)) > .to(options.getOutputDir()). // will be something like: > s3:/// > .withSuffix(".parquet"); > records.apply(String.format("Write(%s)", options.getOutputDir()), > write);{code} > The issue does not appear to be related to ParquetIO.sink(). I am able to > reliably reproduce the issue using JSON formatted records and TextIO.sink(), > as well. Moreover, AvroIO is affected if withWindowedWrites() option is > added. > Just trying some different knobs, I went ahead and set the following option: > {code:java} > write = write.withNoSpilling();{code} > This actually seemed to fix the issue, only to have it reemerge as I scaled > up the data set size. The stack trace, while very similar, reads: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at org.apache.beam
[jira] [Updated] (BEAM-8303) Filesystems not properly registered using FileIO.write()
[ https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Liu updated BEAM-8303: --- Fix Version/s: (was: 2.16.0) 2.17.0 > Filesystems not properly registered using FileIO.write() > > > Key: BEAM-8303 > URL: https://issues.apache.org/jira/browse/BEAM-8303 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.15.0 >Reporter: Preston Koprivica >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.17.0 > > Time Spent: 50m > Remaining Estimate: 0h > > I’m getting the following error when attempting to use the FileIO apis > (beam-2.15.0) and integrating with AWS S3. I have setup the PipelineOptions > with all the relevant AWS options, so the filesystem registry **should** be > properly seeded by the time the graph is compiled and executed: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480) > at > org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93) > at > org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55) > at > org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106) > at > org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72) > at > org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47) > at > org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73) > at > org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503) > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > {code} > For reference, the write code resembles this: > {code:java} > FileIO.Write write = FileIO.write() > .via(ParquetIO.sink(schema)) > .to(options.getOutputDir()). // will be something like: > s3:/// > .withSuffix(".parquet"); > records.apply(String.format("Write(%s)", options.getOutputDir()), > write);{code} > The issue does not appear to be related to ParquetIO.sink(). I am able to > reliably reproduce the issue using JSON formatted records and TextIO.sink(), > as well. Moreover, AvroIO is affected if withWindowedWrites() option is > added. > Just trying some different knobs, I went ahead and set the following option: > {code:java} > write = write.withNoSpilling();{code} > This actually seemed to fix the issue, only to have it reemerge as I scaled > up the data set size. The stack trace, while very similar, reads: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) > at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534) > at
[jira] [Created] (BEAM-8331) Vendored calcite breaks if another calcite is on the class path
Andrew Pilloud created BEAM-8331: Summary: Vendored calcite breaks if another calcite is on the class path Key: BEAM-8331 URL: https://issues.apache.org/jira/browse/BEAM-8331 Project: Beam Issue Type: Bug Components: dsl-sql Affects Versions: 2.15.0 Reporter: Andrew Pilloud If the beam vendored calcite and a non-vendored calcite are both on the classpath, neither version works. This is because the non-JDBC calcite path uses JDBC as a easy way to perform reflection. (This affects the non-JDBC version of calcite.) We need to rewrite the calcite JDBC urls as part of our vendoring (for example 'jdbc:calcite:' to 'jdbc:beam-vendor-calcite:'). Example of where this happens: [https://github.com/apache/calcite/blob/0cce229903a845a7b8ed36cf86d6078fd82d73d3/core/src/main/java/org/apache/calcite/tools/Frameworks.java#L175] {code:java} java.lang.RuntimeException: java.lang.RuntimeException: Property 'org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem' not valid for plugin type org.apache.calcite.rel.type.RelDataTypeSystem at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:160) at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:115) at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.(ZetaSQLPlannerImpl.java:86) at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.(ZetaSQLQueryPlanner.java:55){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320968&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320968 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 23:07 Start Date: 30/Sep/19 23:07 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329823135 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactRetrievalService.java ## @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import java.io.IOException; +import java.io.InputStream; + +/** + * An {@link ArtifactRetrievalService} that loads artifacts as {@link ClassLoader} resources. + * + * The retrieval token should be a path to a JSON-formatted ProxyManifest accessible vai {@link + * ClassLoader#getResource(String)} whose resource locations also point to paths loadable via {@link + * ClassLoader#getResource(String)}. + */ +public class ClassLoaderArtifactRetrievalService extends AbstractArtifactRetrievalService { + + private final ClassLoader classLoader; + + public ClassLoaderArtifactRetrievalService() { +this(ClassLoaderArtifactRetrievalService.class.getClassLoader()); + } + + public ClassLoaderArtifactRetrievalService(ClassLoader classLoader) { +this.classLoader = classLoader; + } + + @Override + public InputStream openManifest(String retrievalToken) throws IOException { +return openUri(retrievalToken, retrievalToken); + } + + @Override + public InputStream openUri(String retrievalToken, String uri) throws IOException { +if (uri.charAt(0) == '/') { Review comment: I always had to use absolute paths when loading from the classpath in a jar. Although I guess that might depend on which class we're loading from? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320968) Time Spent: 1.5h (was: 1h 20m) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 1.5h > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320964 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 23:07 Start Date: 30/Sep/19 23:07 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329821776 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactRetrievalService.java ## @@ -0,0 +1,58 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import java.io.IOException; +import java.io.InputStream; + +/** + * An {@link ArtifactRetrievalService} that loads artifacts as {@link ClassLoader} resources. + * + * The retrieval token should be a path to a JSON-formatted ProxyManifest accessible vai {@link Review comment: via This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320964) Time Spent: 1h 10m (was: 1h) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320967 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 23:07 Start Date: 30/Sep/19 23:07 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329820112 ## File path: runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactServiceTest.java ## @@ -0,0 +1,401 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import com.google.common.base.Charsets; +import com.google.common.collect.ImmutableMap; +import java.io.ByteArrayOutputStream; +import java.io.FileOutputStream; +import java.io.IOException; +import java.net.URI; +import java.net.URL; +import java.net.URLClassLoader; +import java.nio.charset.Charset; +import java.nio.file.FileSystem; +import java.nio.file.FileSystems; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import java.util.zip.ZipEntry; +import java.util.zip.ZipOutputStream; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi; +import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc; +import org.apache.beam.runners.fnexecution.GrpcFnServer; +import org.apache.beam.runners.fnexecution.InProcessServerFactory; +import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ManagedChannel; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.inprocess.InProcessChannelBuilder; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver; +import org.junit.Assert; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.TemporaryFolder; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** + * Tests for {@link ClassLoaderArtifactRetrievalService} and {@link + * JavaFilesystemArtifactStagingService}. + */ +@RunWith(JUnit4.class) +public class ClassLoaderArtifactServiceTest { + + @Rule public TemporaryFolder tempFolder = new TemporaryFolder(); + + private static final int ARTIFACT_CHUNK_SIZE = 100; + + private static final Charset BIJECTIVE_CHARSET = Charsets.ISO_8859_1; + + public interface ArtifactServicePair extends AutoCloseable { + +public String getStagingToken(String nonce); + +public ArtifactStagingServiceGrpc.ArtifactStagingServiceStub createStagingStub() +throws Exception; + +public ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub createStagingBlockingStub() +throws Exception; + +public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub createRetrievalStub() +throws Exception; + +public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub +createRetrievalBlockingStub() throws Exception; + } + + private ArtifactServicePair classLoaderService() throws IOException { Review comment: Since this is a sizable chunk of code for a test, it might be good to provide a comment explaining what this is for (ie emulating a `JarCreator`) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320967) Time Spent: 1h 20m (was: 1h 10m) > Flink portable pipe
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320965 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 23:07 Start Date: 30/Sep/19 23:07 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329814570 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/JavaFilesystemArtifactStagingService.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import java.io.File; +import java.io.IOException; +import java.nio.channels.Channels; +import java.nio.channels.WritableByteChannel; +import java.nio.file.FileSystem; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.Comparator; +import java.util.stream.Stream; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc; + +/** + * An {@link ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase} that loads artifacts into a + * Java FileSystem. Review comment: Could you `@link` the FileSystem class as well? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320965) Time Spent: 1h 10m (was: 1h) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320966&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320966 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 23:07 Start Date: 30/Sep/19 23:07 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329814389 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/JavaFilesystemArtifactStagingService.java ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import java.io.File; +import java.io.IOException; +import java.nio.channels.Channels; +import java.nio.channels.WritableByteChannel; +import java.nio.file.FileSystem; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.Comparator; +import java.util.stream.Stream; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc; + +/** + * An {@link ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase} that loads artifacts into a + * Java FileSystem. + */ +public class JavaFilesystemArtifactStagingService extends AbstractArtifactStagingService { Review comment: SGTM, I agree it's nice to decouple this logic from `PortablePipelineJarCreator`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320966) Time Spent: 1h 10m (was: 1h) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8303) Filesystems not properly registered using FileIO.write()
[ https://issues.apache.org/jira/browse/BEAM-8303?focusedWorklogId=320948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320948 ] ASF GitHub Bot logged work on BEAM-8303: Author: ASF GitHub Bot Created on: 30/Sep/19 22:48 Start Date: 30/Sep/19 22:48 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #9688: [BEAM-8303] Ensure FileSystems registration code runs in non UDF Flink operators URL: https://github.com/apache/beam/pull/9688#issuecomment-536783742 Fix to Portable_Python failure is merged in https://issues.apache.org/jira/browse/BEAM-8324. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320948) Time Spent: 50m (was: 40m) > Filesystems not properly registered using FileIO.write() > > > Key: BEAM-8303 > URL: https://issues.apache.org/jira/browse/BEAM-8303 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.15.0 >Reporter: Preston Koprivica >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.16.0 > > Time Spent: 50m > Remaining Estimate: 0h > > I’m getting the following error when attempting to use the FileIO apis > (beam-2.15.0) and integrating with AWS S3. I have setup the PipelineOptions > with all the relevant AWS options, so the filesystem registry **should** be > properly seeded by the time the graph is compiled and executed: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480) > at > org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93) > at > org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55) > at > org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106) > at > org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72) > at > org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47) > at > org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73) > at > org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503) > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > {code} > For reference, the write code resembles this: > {code:java} > FileIO.Write write = FileIO.write() > .via(ParquetIO.sink(schema)) > .to(options.getOutputDir()). // will be something like: > s3:/// > .withSuffix(".parquet"); > records.apply(String.format("Write(%s)", options.getOutputDir()), > write);{code} > The issue does not appear to be related to ParquetIO.sink(). I am able to > reliably reproduce the issue using JSON formatted records and TextIO.sink(), > as well. Moreover, AvroIO is affected if withWindowedWrites() option is > added. > Just trying some different knobs, I went ahead and set the following option: > {code:java} > write = write.withNoSpilling();{code} > This actually seemed to fix the issue, only to have it reemerge as I scaled > up the data set size. The stack trace, while
[jira] [Work logged] (BEAM-7933) Adding timeout to JobServer grpc calls
[ https://issues.apache.org/jira/browse/BEAM-7933?focusedWorklogId=320919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320919 ] ASF GitHub Bot logged work on BEAM-7933: Author: ASF GitHub Bot Created on: 30/Sep/19 22:36 Start Date: 30/Sep/19 22:36 Worklog Time Spent: 10m Work Description: ecanzonieri commented on pull request #9673: [BEAM-7933] Add job server request timeout (default to 60 seconds) URL: https://github.com/apache/beam/pull/9673#discussion_r329815849 ## File path: sdks/python/apache_beam/runners/portability/portable_runner.py ## @@ -251,7 +253,11 @@ def send_options_request(max_retries=5): # This reports channel is READY but connections may fail # Seems to be only an issue on Mac with port forwardings return job_service.DescribePipelineOptions( - beam_job_api_pb2.DescribePipelineOptionsRequest()) + beam_job_api_pb2.DescribePipelineOptionsRequest(), + timeout=portable_options.job_server_timeout) +except grpc.FutureTimeoutError: + # no retry for timeout errors Review comment: My guess is that we want to retry for other errors, but we should treat timeout as unretriable, so that we never wait for longer than the specified timeout (including retries). We could implement exponential backoff so that it will overall still wait for no longer than the timeout, but I'm not sure it's worth the additional complexity. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320919) Time Spent: 1h 10m (was: 1h) > Adding timeout to JobServer grpc calls > -- > > Key: BEAM-7933 > URL: https://issues.apache.org/jira/browse/BEAM-7933 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.14.0 >Reporter: Enrico Canzonieri >Assignee: Enrico Canzonieri >Priority: Minor > Labels: portability > Time Spent: 1h 10m > Remaining Estimate: 0h > > grpc calls to the JobServer from the Python SDK do not have timeouts. That > means that the call to pipeline.run()could hang forever if the JobServer is > not running (or failing to start). > E.g. > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L307] > the call to Prepare() doesn't provide any timeout value and the same applies > to other JobServer requests. > As part of this ticket we could add a default timeout of 60 seconds as the > default timeout for http client. > Additionally, we could consider adding a --job-server-request-timeout to the > [PortableOptions|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L805] > class to be used in the JobServer interactions inside probable_runner.py. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320910&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320910 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 22:14 Start Date: 30/Sep/19 22:14 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329809736 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/JavaFilesystemArtifactStagingService.java ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import java.io.File; +import java.io.IOException; +import java.nio.channels.Channels; +import java.nio.channels.WritableByteChannel; +import java.nio.file.FileSystem; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.Comparator; +import java.util.stream.Stream; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc; + +/** + * An {@link ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase} that loads artifacts into a + * Java FileSystem. + */ +public class JavaFilesystemArtifactStagingService extends AbstractArtifactStagingService { Review comment: Yes, the `PortablePipelineJarCreator` does not need to use this class, but it could be useful outside that context as well, and I think it's also nice to have the full pair of staging/retrieval (even if just for testing). Fortunately it's a pretty trivial implementation, where all the methods are essentially one-liners. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320910) Time Spent: 1h (was: 50m) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 1h > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320909 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 22:09 Start Date: 30/Sep/19 22:09 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329808387 ## File path: runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactServiceTest.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import com.google.common.base.Charsets; +import com.google.common.collect.ImmutableMap; +import java.io.ByteArrayOutputStream; +import java.io.FileOutputStream; +import java.io.IOException; +import java.net.URI; +import java.net.URL; +import java.net.URLClassLoader; +import java.nio.charset.Charset; +import java.nio.file.FileSystem; +import java.nio.file.FileSystems; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import java.util.zip.ZipEntry; +import java.util.zip.ZipOutputStream; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi; +import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc; +import org.apache.beam.runners.fnexecution.GrpcFnServer; +import org.apache.beam.runners.fnexecution.InProcessServerFactory; +import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ManagedChannel; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.inprocess.InProcessChannelBuilder; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver; +import org.junit.Assert; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.TemporaryFolder; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** + * Tests for {@link ClassLoaderArtifactRetrievalService} and {@link + * JavaFilesystemArtifactStagingService}. + */ +@RunWith(JUnit4.class) +public class ClassLoaderArtifactServiceTest { + + @Rule public TemporaryFolder tempFolder = new TemporaryFolder(); + + private static final int ARTIFACT_CHUNK_SIZE = 100; + + private static final Charset BIJECTIVE_CHARSET = Charsets.ISO_8859_1; + + public interface ArtifactServicePair extends AutoCloseable { + +public String getStagingToken(String nonce); + +public ArtifactStagingServiceGrpc.ArtifactStagingServiceStub createStagingStub() +throws Exception; + +public ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub createStagingBlockingStub() +throws Exception; + +public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub createRetrievalStub() +throws Exception; + +public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub +createRetrievalBlockingStub() throws Exception; + } + + private ArtifactServicePair classLoaderService() throws IOException { +return new ArtifactServicePair() { + + JavaFilesystemArtifactStagingService stagingService; + GrpcFnServer stagingServer; + ClassLoaderArtifactRetrievalService retrievalService; + GrpcFnServer retrievalServer; + + ArtifactStagingServiceGrpc.ArtifactStagingServiceStub stagingStub; + ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub stagingBlockingStub; + ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub retrievalStub; + ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub retrievalBlockingStub; + + Path jarPath = Paths.get(tempFolder
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320908 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 22:07 Start Date: 30/Sep/19 22:07 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329807780 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java ## @@ -53,129 +36,45 @@ * the artifact layout and retrieval token format produced by {@link * BeamFileSystemArtifactStagingService}. */ -public class BeamFileSystemArtifactRetrievalService -extends ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceImplBase -implements ArtifactRetrievalService { +public class BeamFileSystemArtifactRetrievalService extends AbstractArtifactRetrievalService { private static final Logger LOG = LoggerFactory.getLogger(BeamFileSystemArtifactRetrievalService.class); - private static final int ARTIFACT_CHUNK_SIZE_BYTES = 2 << 20; // 2MB + private static final Cache MANIFEST_CACHE = + CacheBuilder.newBuilder() + .expireAfterAccess(1, TimeUnit.HOURS /* arbitrary */) + .maximumSize(100 /* arbitrary */) + .build(); + + public BeamFileSystemArtifactRetrievalService() { +super(MANIFEST_CACHE); + } public static BeamFileSystemArtifactRetrievalService create() { return new BeamFileSystemArtifactRetrievalService(); } @Override - public void getManifest( - ArtifactApi.GetManifestRequest request, - StreamObserver responseObserver) { -final String token = request.getRetrievalToken(); -if (Strings.isNullOrEmpty(token)) { - throw new StatusRuntimeException( - Status.INVALID_ARGUMENT.withDescription("Empty artifact token")); -} + public InputStream openUri(String retrievalToken, String uri) throws IOException { +ResourceId artifactResourceId = FileSystems.matchNewResource(uri, false /* is directory */); Review comment: Yeah. This is trying to say "is directory = false." Saying `false /* not a directory */` would also be ambiguous, given that we use this convention elsewhere. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320908) Time Spent: 40m (was: 0.5h) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 40m > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.
[ https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320907&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320907 ] ASF GitHub Bot logged work on BEAM-7223: Author: ASF GitHub Bot Created on: 30/Sep/19 22:04 Start Date: 30/Sep/19 22:04 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add Python 3.5 Flink ValidatesRunner postcommit suite. URL: https://github.com/apache/beam/pull/9691#issuecomment-536771823 Run Python 3.5 Flink ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320907) Time Spent: 4h (was: 3h 50m) > Add ValidatesRunner test suite for Flink on Python 3. > - > > Key: BEAM-7223 > URL: https://issues.apache.org/jira/browse/BEAM-7223 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Ankur Goenka >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.17.0 > > Time Spent: 4h > Remaining Estimate: 0h > > Add py3 integration tests for Flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules
[ https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320905&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320905 ] ASF GitHub Bot logged work on BEAM-8021: Author: ASF GitHub Bot Created on: 30/Sep/19 22:00 Start Date: 30/Sep/19 22:00 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #9698: [BEAM-8021] Swap build-tools to be compile only so it isn't a "required" dependency of Flink. URL: https://github.com/apache/beam/pull/9698#issuecomment-536764603 R: @lgajowy @iemejia @aaltay @mxm This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320905) Time Spent: 9.5h (was: 9h 20m) > Add Automatic-Module-Name headers for Beam Java modules > > > Key: BEAM-8021 > URL: https://issues.apache.org/jira/browse/BEAM-8021 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Ismaël Mejía >Assignee: Lukasz Gajowy >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 9.5h > Remaining Estimate: 0h > > For compatibility with the Java Platform Module System (JPMS) in Java 9 and > later, every JAR should have a module name, even if the library does not > itself use modules. As [suggested in the mailing > list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E], > this is a simple change that we can do and still be backwards compatible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7933) Adding timeout to JobServer grpc calls
[ https://issues.apache.org/jira/browse/BEAM-7933?focusedWorklogId=320900&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320900 ] ASF GitHub Bot logged work on BEAM-7933: Author: ASF GitHub Bot Created on: 30/Sep/19 21:53 Start Date: 30/Sep/19 21:53 Worklog Time Spent: 10m Work Description: ecanzonieri commented on pull request #9673: [BEAM-7933] Add job server request timeout (default to 60 seconds) URL: https://github.com/apache/beam/pull/9673#discussion_r329803384 ## File path: sdks/python/apache_beam/runners/portability/portable_runner.py ## @@ -251,7 +253,11 @@ def send_options_request(max_retries=5): # This reports channel is READY but connections may fail # Seems to be only an issue on Mac with port forwardings return job_service.DescribePipelineOptions( - beam_job_api_pb2.DescribePipelineOptionsRequest()) + beam_job_api_pb2.DescribePipelineOptionsRequest(), + timeout=portable_options.job_server_timeout) Review comment: My guess is that this is invoking the LocalJobServicer that doesn't have the timeout argument. I'll try to repro locally to verify. If that's the case I'd like to add the timeout argument to all LocalJobServicer methods. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320900) Time Spent: 1h (was: 50m) > Adding timeout to JobServer grpc calls > -- > > Key: BEAM-7933 > URL: https://issues.apache.org/jira/browse/BEAM-7933 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.14.0 >Reporter: Enrico Canzonieri >Assignee: Enrico Canzonieri >Priority: Minor > Labels: portability > Time Spent: 1h > Remaining Estimate: 0h > > grpc calls to the JobServer from the Python SDK do not have timeouts. That > means that the call to pipeline.run()could hang forever if the JobServer is > not running (or failing to start). > E.g. > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L307] > the call to Prepare() doesn't provide any timeout value and the same applies > to other JobServer requests. > As part of this ticket we could add a default timeout of 60 seconds as the > default timeout for http client. > Additionally, we could consider adding a --job-server-request-timeout to the > [PortableOptions|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L805] > class to be used in the JobServer interactions inside probable_runner.py. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.
[ https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320899&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320899 ] ASF GitHub Bot logged work on BEAM-7223: Author: ASF GitHub Bot Created on: 30/Sep/19 21:51 Start Date: 30/Sep/19 21:51 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add Python 3.5 Flink ValidatesRunner postcommit suite. URL: https://github.com/apache/beam/pull/9691#issuecomment-536768069 Run Seed Job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320899) Time Spent: 3h 50m (was: 3h 40m) > Add ValidatesRunner test suite for Flink on Python 3. > - > > Key: BEAM-7223 > URL: https://issues.apache.org/jira/browse/BEAM-7223 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Ankur Goenka >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.17.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Add py3 integration tests for Flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-5559) Beam Dependency Update Request: com.google.guava:guava
[ https://issues.apache.org/jira/browse/BEAM-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941360#comment-16941360 ] Luke Cwik commented on BEAM-5559: - Updating guava may be difficult because of the backwards incompatible changes that were done to guava and the large number of dependencies of Beam that rely on it. > Beam Dependency Update Request: com.google.guava:guava > -- > > Key: BEAM-5559 > URL: https://issues.apache.org/jira/browse/BEAM-5559 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Priority: Major > Fix For: 2.15.0 > > > - 2018-10-01 19:30:53.471497 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 26.0-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2018-10-08 12:18:05.174889 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 26.0-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-04-15 12:32:27.737694 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 27.1-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-04-22 12:10:18.539470 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 27.1-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8330) PubSubIO.readAvros should produce a schema'd PCollection if clazz has a schema
Brian Hulette created BEAM-8330: --- Summary: PubSubIO.readAvros should produce a schema'd PCollection if clazz has a schema Key: BEAM-8330 URL: https://issues.apache.org/jira/browse/BEAM-8330 Project: Beam Issue Type: Improvement Components: io-java-gcp Affects Versions: 2.16.0 Reporter: Brian Hulette Assignee: Brian Hulette Fix For: 2.17.0 Currently {{PubsubIO.readAvros(clazz)}} *always* yields a PCollection with an AvroCoder. This should only be a fallback in the event that no coder can be inferred. That way if we can infer a schema for `clazz` we will produce a PCollection with a schema. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-5559) Beam Dependency Update Request: com.google.guava:guava
[ https://issues.apache.org/jira/browse/BEAM-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-5559: Status: Open (was: Triage Needed) > Beam Dependency Update Request: com.google.guava:guava > -- > > Key: BEAM-5559 > URL: https://issues.apache.org/jira/browse/BEAM-5559 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Priority: Major > Fix For: 2.15.0 > > > - 2018-10-01 19:30:53.471497 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 26.0-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2018-10-08 12:18:05.174889 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 26.0-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-04-15 12:32:27.737694 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 27.1-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-04-22 12:10:18.539470 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 27.1-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-5085) Beam Dependency Update Request: com.google.guava:guava 26.0-jre
[ https://issues.apache.org/jira/browse/BEAM-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941356#comment-16941356 ] Luke Cwik commented on BEAM-5085: - I closed this issue because the dependency management tool that cuts these bugs now just cuts one till it is upgraded which is https://issues.apache.org/jira/browse/BEAM-5559 (I accidentally closed that one as well when trying to perform some Jira cleanup.) > Beam Dependency Update Request: com.google.guava:guava 26.0-jre > --- > > Key: BEAM-5085 > URL: https://issues.apache.org/jira/browse/BEAM-5085 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Luke Cwik >Priority: Major > Fix For: Not applicable > > > 2018-08-06 12:11:45.524503 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: > 2018-08-13 12:13:13.070372 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: > 2018-08-20 12:13:53.326708 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: > 2018-08-27 12:14:56.293907 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: > 2018-09-03 12:27:58.181965 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: > 2018-09-10 12:16:59.770320 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: > 2018-09-17 12:19:31.661114 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: > 2018-09-24 12:26:47.172437 > Please review and upgrade the com.google.guava:guava to the latest > version 26.0-jre > > cc: -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (BEAM-5559) Beam Dependency Update Request: com.google.guava:guava
[ https://issues.apache.org/jira/browse/BEAM-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reopened BEAM-5559: - Assignee: (was: Luke Cwik) > Beam Dependency Update Request: com.google.guava:guava > -- > > Key: BEAM-5559 > URL: https://issues.apache.org/jira/browse/BEAM-5559 > Project: Beam > Issue Type: Sub-task > Components: dependencies >Reporter: Beam JIRA Bot >Priority: Major > Fix For: 2.15.0 > > > - 2018-10-01 19:30:53.471497 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 26.0-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2018-10-08 12:18:05.174889 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 26.0-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-04-15 12:32:27.737694 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 27.1-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. > - 2019-04-22 12:10:18.539470 > - > Please consider upgrading the dependency com.google.guava:guava. > The current version is 20.0. The latest version is 27.1-jre > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8329) AvroCoder ReflectData doesn't use TimestampConversion
Brian Hulette created BEAM-8329: --- Summary: AvroCoder ReflectData doesn't use TimestampConversion Key: BEAM-8329 URL: https://issues.apache.org/jira/browse/BEAM-8329 Project: Beam Issue Type: Bug Components: sdk-java-core Affects Versions: 2.16.0 Reporter: Brian Hulette Assignee: Brian Hulette Fix For: 2.17.0 The ReflectData created by AvroCoder doesn't have {{TimeConversions.TimestampConversion}} registered which prevents it from working with millis-instant joda DateTime instances. AvroUtils does this statically [here|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java#L78]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules
[ https://issues.apache.org/jira/browse/BEAM-8021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941351#comment-16941351 ] Luke Cwik commented on BEAM-8021: - I closed Lukasz Gajowy's PR and migrated Flink runner to not export the build-tools artifact as a required dependency in [https://github.com/apache/beam/pull/9698] > Add Automatic-Module-Name headers for Beam Java modules > > > Key: BEAM-8021 > URL: https://issues.apache.org/jira/browse/BEAM-8021 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Ismaël Mejía >Assignee: Lukasz Gajowy >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 9h 20m > Remaining Estimate: 0h > > For compatibility with the Java Platform Module System (JPMS) in Java 9 and > later, every JAR should have a module name, even if the library does not > itself use modules. As [suggested in the mailing > list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E], > this is a simple change that we can do and still be backwards compatible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6896) Beam Dependency Update Request: PyYAML
[ https://issues.apache.org/jira/browse/BEAM-6896?focusedWorklogId=320893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320893 ] ASF GitHub Bot logged work on BEAM-6896: Author: ASF GitHub Bot Created on: 30/Sep/19 21:40 Start Date: 30/Sep/19 21:40 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #9699: [BEAM-6896] Loosen PyYAML dependency. URL: https://github.com/apache/beam/pull/9699 Also update this to be a test-only dependency, as it's only used for tests. There were no relevant API changes with the last major release. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Pyth
[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules
[ https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320892&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320892 ] ASF GitHub Bot logged work on BEAM-8021: Author: ASF GitHub Bot Created on: 30/Sep/19 21:40 Start Date: 30/Sep/19 21:40 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #9698: [BEAM-8021] Swap build-tools to be compile only so it isn't a "required" dependency of Flink. URL: https://github.com/apache/beam/pull/9698#issuecomment-536764603 @lgajowy @iemejia @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320892) Time Spent: 9h 10m (was: 9h) > Add Automatic-Module-Name headers for Beam Java modules > > > Key: BEAM-8021 > URL: https://issues.apache.org/jira/browse/BEAM-8021 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Ismaël Mejía >Assignee: Lukasz Gajowy >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 9h 10m > Remaining Estimate: 0h > > For compatibility with the Java Platform Module System (JPMS) in Java 9 and > later, every JAR should have a module name, even if the library does not > itself use modules. As [suggested in the mailing > list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E], > this is a simple change that we can do and still be backwards compatible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6896) Beam Dependency Update Request: PyYAML
[ https://issues.apache.org/jira/browse/BEAM-6896?focusedWorklogId=320895&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320895 ] ASF GitHub Bot logged work on BEAM-6896: Author: ASF GitHub Bot Created on: 30/Sep/19 21:40 Start Date: 30/Sep/19 21:40 Worklog Time Spent: 10m Work Description: robertwb commented on issue #9699: [BEAM-6896] Loosen PyYAML dependency. URL: https://github.com/apache/beam/pull/9699#issuecomment-536764669 R: @yifanzou This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320895) Time Spent: 20m (was: 10m) > Beam Dependency Update Request: PyYAML > -- > > Key: BEAM-6896 > URL: https://issues.apache.org/jira/browse/BEAM-6896 > Project: Beam > Issue Type: Bug > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.17.0 > > Time Spent: 20m > Remaining Estimate: 0h > > - 2019-03-25 04:17:47.501359 > - > Please consider upgrading the dependency PyYAML. > The current version is 3.13. The latest version is 5.1 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules
[ https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320894&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320894 ] ASF GitHub Bot logged work on BEAM-8021: Author: ASF GitHub Bot Created on: 30/Sep/19 21:40 Start Date: 30/Sep/19 21:40 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #9698: [BEAM-8021] Swap build-tools to be compile only so it isn't a "required" dependency of Flink. URL: https://github.com/apache/beam/pull/9698#issuecomment-536764603 R: @lgajowy @iemejia @aaltay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320894) Time Spent: 9h 20m (was: 9h 10m) > Add Automatic-Module-Name headers for Beam Java modules > > > Key: BEAM-8021 > URL: https://issues.apache.org/jira/browse/BEAM-8021 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Ismaël Mejía >Assignee: Lukasz Gajowy >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 9h 20m > Remaining Estimate: 0h > > For compatibility with the Java Platform Module System (JPMS) in Java 9 and > later, every JAR should have a module name, even if the library does not > itself use modules. As [suggested in the mailing > list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E], > this is a simple change that we can do and still be backwards compatible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules
[ https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320890&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320890 ] ASF GitHub Bot logged work on BEAM-8021: Author: ASF GitHub Bot Created on: 30/Sep/19 21:38 Start Date: 30/Sep/19 21:38 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #9690: [BEAM-8021] Publish build-tools jar again due to flink runner usage issues URL: https://github.com/apache/beam/pull/9690 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320890) Time Spent: 9h (was: 8h 50m) > Add Automatic-Module-Name headers for Beam Java modules > > > Key: BEAM-8021 > URL: https://issues.apache.org/jira/browse/BEAM-8021 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Ismaël Mejía >Assignee: Lukasz Gajowy >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 9h > Remaining Estimate: 0h > > For compatibility with the Java Platform Module System (JPMS) in Java 9 and > later, every JAR should have a module name, even if the library does not > itself use modules. As [suggested in the mailing > list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E], > this is a simple change that we can do and still be backwards compatible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules
[ https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320889&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320889 ] ASF GitHub Bot logged work on BEAM-8021: Author: ASF GitHub Bot Created on: 30/Sep/19 21:38 Start Date: 30/Sep/19 21:38 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #9690: [BEAM-8021] Publish build-tools jar again due to flink runner usage issues URL: https://github.com/apache/beam/pull/9690#issuecomment-536763886 Closing in favor of #9698 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320889) Time Spent: 8h 50m (was: 8h 40m) > Add Automatic-Module-Name headers for Beam Java modules > > > Key: BEAM-8021 > URL: https://issues.apache.org/jira/browse/BEAM-8021 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Ismaël Mejía >Assignee: Lukasz Gajowy >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 8h 50m > Remaining Estimate: 0h > > For compatibility with the Java Platform Module System (JPMS) in Java 9 and > later, every JAR should have a module name, even if the library does not > itself use modules. As [suggested in the mailing > list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E], > this is a simple change that we can do and still be backwards compatible. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules
[ https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320888&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320888 ] ASF GitHub Bot logged work on BEAM-8021: Author: ASF GitHub Bot Created on: 30/Sep/19 21:37 Start Date: 30/Sep/19 21:37 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #9698: [BEAM-8021] Swap build-tools to be compile only so it isn't a "required" dependency of Flink. URL: https://github.com/apache/beam/pull/9698 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastC
[jira] [Commented] (BEAM-6896) Beam Dependency Update Request: PyYAML
[ https://issues.apache.org/jira/browse/BEAM-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941345#comment-16941345 ] Robert Bradshaw commented on BEAM-6896: --- This should be an easy fix. > Beam Dependency Update Request: PyYAML > -- > > Key: BEAM-6896 > URL: https://issues.apache.org/jira/browse/BEAM-6896 > Project: Beam > Issue Type: Bug > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.17.0 > > > - 2019-03-25 04:17:47.501359 > - > Please consider upgrading the dependency PyYAML. > The current version is 3.13. The latest version is 5.1 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.
[ https://issues.apache.org/jira/browse/BEAM-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valentyn Tymofieiev updated BEAM-5878: -- Fix Version/s: (was: 2.16.0) 2.17.0 > Support DoFns with Keyword-only arguments in Python 3. > -- > > Key: BEAM-5878 > URL: https://issues.apache.org/jira/browse/BEAM-5878 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: yoshiki obata >Priority: Minor > Fix For: 2.17.0 > > Time Spent: 14h 40m > Remaining Estimate: 0h > > Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to > define functions with keyword-only arguments. > Currently Beam does not handle them correctly. [~ruoyu] pointed out [one > place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118] > in our codebase that we should fix: in Python in 3.0 inspect.getargspec() > will fail on functions with keyword-only arguments, but a new method > [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec] > supports them. > There may be implications for our (best-effort) type-hints machinery. > We should also add a Py3-only unit tests that covers DoFn's with keyword-only > arguments once Beam Python 3 tests are in a good shape. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7933) Adding timeout to JobServer grpc calls
[ https://issues.apache.org/jira/browse/BEAM-7933?focusedWorklogId=320886&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320886 ] ASF GitHub Bot logged work on BEAM-7933: Author: ASF GitHub Bot Created on: 30/Sep/19 21:29 Start Date: 30/Sep/19 21:29 Worklog Time Spent: 10m Work Description: ecanzonieri commented on pull request #9673: [BEAM-7933] Add job server request timeout (default to 60 seconds) URL: https://github.com/apache/beam/pull/9673#discussion_r329794942 ## File path: sdks/python/apache_beam/runners/portability/portable_runner.py ## @@ -141,23 +141,25 @@ def _create_environment(options): payload=(portable_options.environment_config.encode('ascii') if portable_options.environment_config else None)) - def default_job_server(self, options): + def default_job_server(self, portable_options): # TODO Provide a way to specify a container Docker URL # https://issues.apache.org/jira/browse/BEAM-6328 if not self._dockerized_job_server: self._dockerized_job_server = job_server.StopOnExitJobServer( job_server.DockerizedJobServer()) return self._dockerized_job_server - def create_job_service(self, options): -job_endpoint = options.view_as(PortableOptions).job_endpoint + def create_job_service(self, portable_options): Review comment: I changed it because it is actually immediately converted back to portable_options. Basically the caller takes options and converts it to portable_options, then to the create_job_service method it passes options which is again converted to portable_options. It seems to be that we could pass portable_options in the first place and simplify the code a bit. However, I don't have a strong option, I'll revert back to options. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320886) Time Spent: 50m (was: 40m) > Adding timeout to JobServer grpc calls > -- > > Key: BEAM-7933 > URL: https://issues.apache.org/jira/browse/BEAM-7933 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Affects Versions: 2.14.0 >Reporter: Enrico Canzonieri >Assignee: Enrico Canzonieri >Priority: Minor > Labels: portability > Time Spent: 50m > Remaining Estimate: 0h > > grpc calls to the JobServer from the Python SDK do not have timeouts. That > means that the call to pipeline.run()could hang forever if the JobServer is > not running (or failing to start). > E.g. > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L307] > the call to Prepare() doesn't provide any timeout value and the same applies > to other JobServer requests. > As part of this ticket we could add a default timeout of 60 seconds as the > default timeout for http client. > Additionally, we could consider adding a --job-server-request-timeout to the > [PortableOptions|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L805] > class to be used in the JobServer interactions inside probable_runner.py. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valentyn Tymofieiev resolved BEAM-8324. --- Resolution: Fixed > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 2h > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320882 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 21:27 Start Date: 30/Sep/19 21:27 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #9696: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9696#issuecomment-536760418 This is cherry pick of https://github.com/apache/beam/pull/9695 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320882) Time Spent: 1h 50m (was: 1h 40m) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320883 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 21:27 Start Date: 30/Sep/19 21:27 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9696: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9696 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320883) Time Spent: 2h (was: 1h 50m) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 2h > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar
[ https://issues.apache.org/jira/browse/BEAM-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941330#comment-16941330 ] Ankur Goenka commented on BEAM-8183: Flink has some neat feature of picking the pipeline on the fly. I don't think this is a very common usecase with Beam though. Given Beam Job Submission api work on a single pipeline at a time it will be very convoluted work flow to introduce multiple pipelines in a single jar. Will it be easier to just store pipelines as separate jar in global storage (hdfs etc) and pass the right jar at the time of pipeline submission? In case the submission happen through a service then will it be easier and less error prone to just keep these jars separately on the service and submit the right jar to flink based on the parameter? > Optionally bundle multiple pipelines into a single Flink jar > > > Key: BEAM-8183 > URL: https://issues.apache.org/jira/browse/BEAM-8183 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > > [https://github.com/apache/beam/pull/9331#issuecomment-526734851] > "With Flink you can bundle multiple entry points into the same jar file and > specify which one to use with optional flags. It may be desirable to allow > inclusion of multiple pipelines for this tool also, although that would > require a different workflow. Absent this option, it becomes quite convoluted > for users that need the flexibility to choose which pipeline to launch at > submission time." -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.
[ https://issues.apache.org/jira/browse/BEAM-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941329#comment-16941329 ] Valentyn Tymofieiev commented on BEAM-5878: --- We have restricted dill version on master due to stackfoverflow error (https://issues.apache.org/jira/browse/BEAM-8324?focusedCommentId=16941261&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16941261), thanks for reporting it. Also, https://github.com/apache/beam/pull/9696 will restrict dill version on 2.16.0, keyword-only arguments won't yet be available in 2.16.0 yet, but. We can continue to investigate the problem related to dill upgrade in this bug. > Support DoFns with Keyword-only arguments in Python 3. > -- > > Key: BEAM-5878 > URL: https://issues.apache.org/jira/browse/BEAM-5878 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: yoshiki obata >Priority: Minor > Fix For: 2.16.0 > > Time Spent: 14h 40m > Remaining Estimate: 0h > > Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to > define functions with keyword-only arguments. > Currently Beam does not handle them correctly. [~ruoyu] pointed out [one > place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118] > in our codebase that we should fix: in Python in 3.0 inspect.getargspec() > will fail on functions with keyword-only arguments, but a new method > [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec] > supports them. > There may be implications for our (best-effort) type-hints machinery. > We should also add a Py3-only unit tests that covers DoFn's with keyword-only > arguments once Beam Python 3 tests are in a good shape. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.
[ https://issues.apache.org/jira/browse/BEAM-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941329#comment-16941329 ] Valentyn Tymofieiev edited comment on BEAM-5878 at 9/30/19 9:07 PM: We have restricted dill version on master due to stackfoverflow error (https://issues.apache.org/jira/browse/BEAM-8324?focusedCommentId=16941261&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16941261), thanks for reporting it. Also, https://github.com/apache/beam/pull/9696 will restrict dill version on 2.16.0, so keyword-only arguments won't yet be available in 2.16.0 yet. We can continue to investigate the problem related to dill upgrade in this bug. was (Author: tvalentyn): We have restricted dill version on master due to stackfoverflow error (https://issues.apache.org/jira/browse/BEAM-8324?focusedCommentId=16941261&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16941261), thanks for reporting it. Also, https://github.com/apache/beam/pull/9696 will restrict dill version on 2.16.0, keyword-only arguments won't yet be available in 2.16.0 yet, but. We can continue to investigate the problem related to dill upgrade in this bug. > Support DoFns with Keyword-only arguments in Python 3. > -- > > Key: BEAM-5878 > URL: https://issues.apache.org/jira/browse/BEAM-5878 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: yoshiki obata >Priority: Minor > Fix For: 2.16.0 > > Time Spent: 14h 40m > Remaining Estimate: 0h > > Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to > define functions with keyword-only arguments. > Currently Beam does not handle them correctly. [~ruoyu] pointed out [one > place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118] > in our codebase that we should fix: in Python in 3.0 inspect.getargspec() > will fail on functions with keyword-only arguments, but a new method > [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec] > supports them. > There may be implications for our (best-effort) type-hints machinery. > We should also add a Py3-only unit tests that covers DoFn's with keyword-only > arguments once Beam Python 3 tests are in a good shape. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320874&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320874 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 21:00 Start Date: 30/Sep/19 21:00 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9696: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9696#issuecomment-536750631 R: @markflyhigh This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320874) Time Spent: 1h 40m (was: 1.5h) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320859&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320859 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 20:59 Start Date: 30/Sep/19 20:59 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9695: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9695 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320859) Time Spent: 1.5h (was: 1h 20m) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320858&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320858 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 20:57 Start Date: 30/Sep/19 20:57 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9695: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9695#issuecomment-536749358 R: @aaltay or @markflyhigh This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320858) Time Spent: 1h 20m (was: 1h 10m) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-5840) Add unit tests for import scripts
[ https://issues.apache.org/jira/browse/BEAM-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin reassigned BEAM-5840: --- Assignee: (was: Mikhail Gryzykhin) > Add unit tests for import scripts > - > > Key: BEAM-5840 > URL: https://issues.apache.org/jira/browse/BEAM-5840 > Project: Beam > Issue Type: Sub-task > Components: project-management >Reporter: Scott Wegner >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > For custom import scripts, we should refactor the logic to be testable and > add some basic unit tests. > At the very least, there should be some "smoke test" that the local > deployment works. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-5923) Utilize common .pylintrc with python sdk for .test-infra/metrics
[ https://issues.apache.org/jira/browse/BEAM-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin reassigned BEAM-5923: --- Assignee: (was: Mikhail Gryzykhin) > Utilize common .pylintrc with python sdk for .test-infra/metrics > > > Key: BEAM-5923 > URL: https://issues.apache.org/jira/browse/BEAM-5923 > Project: Beam > Issue Type: Sub-task > Components: project-management >Reporter: Mikhail Gryzykhin >Priority: Major > > Add common linter and formatter to metrics code. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8303) Filesystems not properly registered using FileIO.write()
[ https://issues.apache.org/jira/browse/BEAM-8303?focusedWorklogId=320852&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320852 ] ASF GitHub Bot logged work on BEAM-8303: Author: ASF GitHub Bot Created on: 30/Sep/19 20:46 Start Date: 30/Sep/19 20:46 Worklog Time Spent: 10m Work Description: markflyhigh commented on issue #9688: [BEAM-8303] Ensure FileSystems registration code runs in non UDF Flink operators URL: https://github.com/apache/beam/pull/9688#issuecomment-536744819 Portable_Python failure is known and tracking in https://issues.apache.org/jira/browse/BEAM-8324. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320852) Time Spent: 40m (was: 0.5h) > Filesystems not properly registered using FileIO.write() > > > Key: BEAM-8303 > URL: https://issues.apache.org/jira/browse/BEAM-8303 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.15.0 >Reporter: Preston Koprivica >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.16.0 > > Time Spent: 40m > Remaining Estimate: 0h > > I’m getting the following error when attempting to use the FileIO apis > (beam-2.15.0) and integrating with AWS S3. I have setup the PipelineOptions > with all the relevant AWS options, so the filesystem registry **should** be > properly seeded by the time the graph is compiled and executed: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480) > at > org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93) > at > org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55) > at > org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106) > at > org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72) > at > org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47) > at > org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73) > at > org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503) > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > {code} > For reference, the write code resembles this: > {code:java} > FileIO.Write write = FileIO.write() > .via(ParquetIO.sink(schema)) > .to(options.getOutputDir()). // will be something like: > s3:/// > .withSuffix(".parquet"); > records.apply(String.format("Write(%s)", options.getOutputDir()), > write);{code} > The issue does not appear to be related to ParquetIO.sink(). I am able to > reliably reproduce the issue using JSON formatted records and TextIO.sink(), > as well. Moreover, AvroIO is affected if withWindowedWrites() option is > added. > Just trying some different knobs, I went ahead and set the following option: > {code:java} > write = write.withNoSpilling();{code} > This actually seemed to fix the issue, only to have it reemerge as I scaled > up the data set size. The stack trace,
[jira] [Work logged] (BEAM-8321) beam_PostCommit_PortableJar_Flink postcommit failing
[ https://issues.apache.org/jira/browse/BEAM-8321?focusedWorklogId=320847&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320847 ] ASF GitHub Bot logged work on BEAM-8321: Author: ASF GitHub Bot Created on: 30/Sep/19 20:41 Start Date: 30/Sep/19 20:41 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9680: [BEAM-8321] fix Flink portable jar test URL: https://github.com/apache/beam/pull/9680 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320847) Time Spent: 1h 20m (was: 1h 10m) > beam_PostCommit_PortableJar_Flink postcommit failing > > > Key: BEAM-8321 > URL: https://issues.apache.org/jira/browse/BEAM-8321 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Python docker container image names/tags need updated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-6816) More PR information in the code velocity dashboard?
[ https://issues.apache.org/jira/browse/BEAM-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941314#comment-16941314 ] Mikhail Gryzykhin commented on BEAM-6816: - Unassigning since I'm not actively working on this task. > More PR information in the code velocity dashboard? > --- > > Key: BEAM-6816 > URL: https://issues.apache.org/jira/browse/BEAM-6816 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Pablo Estrada >Priority: Major > > I've been using this dashboard: > http://104.154.241.245/d/code_velocity/code-velocity?orgId=1 > To triage and get PRs reviewed. Some of them have DO NOT REVIEW / or special > instructions in the title. > Perhaps it would be nice to have more information displayed on the dashboard, > so we can skip PRs that don't need to be reviewed. > As you rightly mentioned, another idea is to exclude PRs that have a specific > tag - and I think that's a good idea in fact. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8321) beam_PostCommit_PortableJar_Flink postcommit failing
[ https://issues.apache.org/jira/browse/BEAM-8321?focusedWorklogId=320846&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320846 ] ASF GitHub Bot logged work on BEAM-8321: Author: ASF GitHub Bot Created on: 30/Sep/19 20:40 Start Date: 30/Sep/19 20:40 Worklog Time Spent: 10m Work Description: ibzib commented on issue #9680: [BEAM-8321] fix Flink portable jar test URL: https://github.com/apache/beam/pull/9680#issuecomment-536742437 Failure (`org.apache.beam.runners.dataflow.worker.fn.control.TimerReceiverTest.testSingleTimerScheduling`) looks unrelated This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320846) Time Spent: 1h 10m (was: 1h) > beam_PostCommit_PortableJar_Flink postcommit failing > > > Key: BEAM-8321 > URL: https://issues.apache.org/jira/browse/BEAM-8321 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Python docker container image names/tags need updated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-6816) More PR information in the code velocity dashboard?
[ https://issues.apache.org/jira/browse/BEAM-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin reassigned BEAM-6816: --- Assignee: (was: Mikhail Gryzykhin) > More PR information in the code velocity dashboard? > --- > > Key: BEAM-6816 > URL: https://issues.apache.org/jira/browse/BEAM-6816 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Pablo Estrada >Priority: Major > > I've been using this dashboard: > http://104.154.241.245/d/code_velocity/code-velocity?orgId=1 > To triage and get PRs reviewed. Some of them have DO NOT REVIEW / or special > instructions in the title. > Perhaps it would be nice to have more information displayed on the dashboard, > so we can skip PRs that don't need to be reviewed. > As you rightly mentioned, another idea is to exclude PRs that have a specific > tag - and I think that's a good idea in fact. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.
[ https://issues.apache.org/jira/browse/BEAM-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin resolved BEAM-7969. - Fix Version/s: 2.17.0 Resolution: Fixed > Streaming Dataflow worker doesn't report FnAPI metrics. > --- > > Key: BEAM-7969 > URL: https://issues.apache.org/jira/browse/BEAM-7969 > Project: Beam > Issue Type: Bug > Components: java-fn-execution, runner-dataflow >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > Fix For: 2.17.0 > > Time Spent: 7h > Remaining Estimate: 0h > > EOM -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (BEAM-7902) :beam-test-infra-metrics:test is consistently failing
[ https://issues.apache.org/jira/browse/BEAM-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941312#comment-16941312 ] Mikhail Gryzykhin edited comment on BEAM-7902 at 9/30/19 8:34 PM: -- Closing in favor of https://issues.apache.org/jira/browse/BEAM-8327 and https://issues.apache.org/jira/browse/BEAM-8328 was (Author: ardagan): Closing in favor of https://issues.apache.org/jira/browse/BEAM-8327 > :beam-test-infra-metrics:test is consistently failing > - > > Key: BEAM-7902 > URL: https://issues.apache.org/jira/browse/BEAM-7902 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Mark Liu >Assignee: Mikhail Gryzykhin >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > A failure gradle run from my local machine: > https://gradle.com/s/sqcwi7bo3a2nw. Stacktrace: > {code} > org.apache.beam.testinfra.metrics.ProberTests > CheckGrafanaStalenessAlerts > FAILED > java.lang.AssertionError: Input data is stale! [id:1, dashboardId:5, > dashboardUid:data-freshness, dashboardSlug:source-data-freshness, panelId:2, > name:Source Data Freshness alert, state:alerting, > newStateDate:2019-07-30T02:55:02Z, evalDate:0001-01-01T00:00:00Z, > evalData:[evalMatches:[[metric:GitHub, tags:null, > value:482.1661704504]]], executionError:, > url:/d/data-freshness/source-data-freshness] >See: http://104.154.241.245/d/data-freshness. Expression: (alert.state > == ok) > at > org.codehaus.groovy.runtime.InvokerHelper.assertFailed(InvokerHelper.java:406) > at > org.codehaus.groovy.runtime.ScriptBytecodeAdapter.assertFailed(ScriptBytecodeAdapter.java:650) > ... > {code} > From the stacktrace suggestion, I went to > http://104.154.241.245/d/data-freshness and got Github source is older than a > week, which I guess is potentially the failure reason. However, no further > instruction on what to do next. > After some investigations, I found this test `:beam-test-infra-metrics:test` > is added to > [beam_Prober_CommunityMetrics|https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Prober_CommunityMetrics/] > but never got passed (either failed or skipped due to cache). > Is there anything wrong with Github data refresh? Should we reevaluate the > alert on dashboard or modify the test? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7902) :beam-test-infra-metrics:test is consistently failing
[ https://issues.apache.org/jira/browse/BEAM-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Gryzykhin resolved BEAM-7902. - Fix Version/s: Not applicable Resolution: Fixed > :beam-test-infra-metrics:test is consistently failing > - > > Key: BEAM-7902 > URL: https://issues.apache.org/jira/browse/BEAM-7902 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Mark Liu >Assignee: Mikhail Gryzykhin >Priority: Major > Fix For: Not applicable > > Time Spent: 0.5h > Remaining Estimate: 0h > > A failure gradle run from my local machine: > https://gradle.com/s/sqcwi7bo3a2nw. Stacktrace: > {code} > org.apache.beam.testinfra.metrics.ProberTests > CheckGrafanaStalenessAlerts > FAILED > java.lang.AssertionError: Input data is stale! [id:1, dashboardId:5, > dashboardUid:data-freshness, dashboardSlug:source-data-freshness, panelId:2, > name:Source Data Freshness alert, state:alerting, > newStateDate:2019-07-30T02:55:02Z, evalDate:0001-01-01T00:00:00Z, > evalData:[evalMatches:[[metric:GitHub, tags:null, > value:482.1661704504]]], executionError:, > url:/d/data-freshness/source-data-freshness] >See: http://104.154.241.245/d/data-freshness. Expression: (alert.state > == ok) > at > org.codehaus.groovy.runtime.InvokerHelper.assertFailed(InvokerHelper.java:406) > at > org.codehaus.groovy.runtime.ScriptBytecodeAdapter.assertFailed(ScriptBytecodeAdapter.java:650) > ... > {code} > From the stacktrace suggestion, I went to > http://104.154.241.245/d/data-freshness and got Github source is older than a > week, which I guess is potentially the failure reason. However, no further > instruction on what to do next. > After some investigations, I found this test `:beam-test-infra-metrics:test` > is added to > [beam_Prober_CommunityMetrics|https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Prober_CommunityMetrics/] > but never got passed (either failed or skipped due to cache). > Is there anything wrong with Github data refresh? Should we reevaluate the > alert on dashboard or modify the test? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-7902) :beam-test-infra-metrics:test is consistently failing
[ https://issues.apache.org/jira/browse/BEAM-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941312#comment-16941312 ] Mikhail Gryzykhin commented on BEAM-7902: - Closing in favor of https://issues.apache.org/jira/browse/BEAM-8327 > :beam-test-infra-metrics:test is consistently failing > - > > Key: BEAM-7902 > URL: https://issues.apache.org/jira/browse/BEAM-7902 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Mark Liu >Assignee: Mikhail Gryzykhin >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > A failure gradle run from my local machine: > https://gradle.com/s/sqcwi7bo3a2nw. Stacktrace: > {code} > org.apache.beam.testinfra.metrics.ProberTests > CheckGrafanaStalenessAlerts > FAILED > java.lang.AssertionError: Input data is stale! [id:1, dashboardId:5, > dashboardUid:data-freshness, dashboardSlug:source-data-freshness, panelId:2, > name:Source Data Freshness alert, state:alerting, > newStateDate:2019-07-30T02:55:02Z, evalDate:0001-01-01T00:00:00Z, > evalData:[evalMatches:[[metric:GitHub, tags:null, > value:482.1661704504]]], executionError:, > url:/d/data-freshness/source-data-freshness] >See: http://104.154.241.245/d/data-freshness. Expression: (alert.state > == ok) > at > org.codehaus.groovy.runtime.InvokerHelper.assertFailed(InvokerHelper.java:406) > at > org.codehaus.groovy.runtime.ScriptBytecodeAdapter.assertFailed(ScriptBytecodeAdapter.java:650) > ... > {code} > From the stacktrace suggestion, I went to > http://104.154.241.245/d/data-freshness and got Github source is older than a > week, which I guess is potentially the failure reason. However, no further > instruction on what to do next. > After some investigations, I found this test `:beam-test-infra-metrics:test` > is added to > [beam_Prober_CommunityMetrics|https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Prober_CommunityMetrics/] > but never got passed (either failed or skipped due to cache). > Is there anything wrong with Github data refresh? Should we reevaluate the > alert on dashboard or modify the test? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8328) remove :beam-test-infra-metrics:test from build target.
Mikhail Gryzykhin created BEAM-8328: --- Summary: remove :beam-test-infra-metrics:test from build target. Key: BEAM-8328 URL: https://issues.apache.org/jira/browse/BEAM-8328 Project: Beam Issue Type: Bug Components: project-management Reporter: Mikhail Gryzykhin -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320837 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 20:32 Start Date: 30/Sep/19 20:32 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329772101 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/JavaFilesystemArtifactStagingService.java ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import java.io.File; +import java.io.IOException; +import java.nio.channels.Channels; +import java.nio.channels.WritableByteChannel; +import java.nio.file.FileSystem; +import java.nio.file.Files; +import java.nio.file.Path; +import java.util.Comparator; +import java.util.stream.Stream; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc; + +/** + * An {@link ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase} that loads artifacts into a + * Java FileSystem. + */ +public class JavaFilesystemArtifactStagingService extends AbstractArtifactStagingService { Review comment: I'm not sure why this staging service is necessary. `PortablePipelineJarCreator` already writes artifacts to a jar as if "staging" them. The Flink runner then copies everything from the application jar's classpath to the task managers, so we should be able to retrieve artifacts from the classpath directly without going through an extra re-staging step. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320837) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 20m > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320836&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320836 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 20:32 Start Date: 30/Sep/19 20:32 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329756483 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/AbstractArtifactRetrievalService.java ## @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument; + +import java.io.IOException; +import java.io.InputStream; +import java.nio.charset.StandardCharsets; +import java.util.List; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest; +import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc; +import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString; +import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.util.JsonFormat; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Strings; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.Cache; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.CacheBuilder; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.hash.Hasher; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.hash.Hashing; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.io.ByteStreams; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * An {@link ArtifactRetrievalService} that handles everything aside from actually opening the + * backing resources. Review comment: :+1: This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320836) Time Spent: 20m (was: 10m) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 20m > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/bea
[jira] [Commented] (BEAM-8327) beam_Prober_CommunityMetrics hits cache giving wrong results
[ https://issues.apache.org/jira/browse/BEAM-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941311#comment-16941311 ] Mikhail Gryzykhin commented on BEAM-8327: - Came up from this failure. https://issues.apache.org/jira/browse/BEAM-7902 > beam_Prober_CommunityMetrics hits cache giving wrong results > > > Key: BEAM-8327 > URL: https://issues.apache.org/jira/browse/BEAM-8327 > Project: Beam > Issue Type: Bug > Components: project-management >Reporter: Mikhail Gryzykhin >Priority: Major > > We need to fix beam_Prober_CommunityMetrics target to not hit cache. It > always fetches fresh data from website even though binaries are the same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-8327) beam_Prober_CommunityMetrics hits cache giving wrong results
Mikhail Gryzykhin created BEAM-8327: --- Summary: beam_Prober_CommunityMetrics hits cache giving wrong results Key: BEAM-8327 URL: https://issues.apache.org/jira/browse/BEAM-8327 Project: Beam Issue Type: Bug Components: project-management Reporter: Mikhail Gryzykhin We need to fix beam_Prober_CommunityMetrics target to not hit cache. It always fetches fresh data from website even though binaries are the same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320838&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320838 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 20:32 Start Date: 30/Sep/19 20:32 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329758955 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java ## @@ -53,129 +36,45 @@ * the artifact layout and retrieval token format produced by {@link * BeamFileSystemArtifactStagingService}. */ -public class BeamFileSystemArtifactRetrievalService -extends ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceImplBase -implements ArtifactRetrievalService { +public class BeamFileSystemArtifactRetrievalService extends AbstractArtifactRetrievalService { private static final Logger LOG = LoggerFactory.getLogger(BeamFileSystemArtifactRetrievalService.class); - private static final int ARTIFACT_CHUNK_SIZE_BYTES = 2 << 20; // 2MB + private static final Cache MANIFEST_CACHE = + CacheBuilder.newBuilder() + .expireAfterAccess(1, TimeUnit.HOURS /* arbitrary */) + .maximumSize(100 /* arbitrary */) + .build(); + + public BeamFileSystemArtifactRetrievalService() { +super(MANIFEST_CACHE); + } public static BeamFileSystemArtifactRetrievalService create() { return new BeamFileSystemArtifactRetrievalService(); } @Override - public void getManifest( - ArtifactApi.GetManifestRequest request, - StreamObserver responseObserver) { -final String token = request.getRetrievalToken(); -if (Strings.isNullOrEmpty(token)) { - throw new StatusRuntimeException( - Status.INVALID_ARGUMENT.withDescription("Empty artifact token")); -} + public InputStream openUri(String retrievalToken, String uri) throws IOException { +ResourceId artifactResourceId = FileSystems.matchNewResource(uri, false /* is directory */); Review comment: Nit: I know this was like this before, but "is directory" is ambiguous -- is this trying to say "false, meaning it's a directory," or "is directory = false"? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320838) > Flink portable pipeline jars do not need to stage artifacts remotely > > > Key: BEAM-8312 > URL: https://issues.apache.org/jira/browse/BEAM-8312 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 20m > Remaining Estimate: 0h > > Currently, Flink job jars re-stage all artifacts at runtime (on the > JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. > However, since the manifest and all the artifacts live on the classpath of > the jar, and everything from the classpath is copied to the Flink workers > anyway [2], it should not be necessary to do additional artifact staging. We > could replace BeamFileSystemArtifactRetrievalService in this case with a > simple ArtifactRetrievalService that just pulls the artifacts from the > classpath. > > [1] > [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java] > [2] > [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely
[ https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320839 ] ASF GitHub Bot logged work on BEAM-8312: Author: ASF GitHub Bot Created on: 30/Sep/19 20:32 Start Date: 30/Sep/19 20:32 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #9681: [BEAM-8312] ArtifactRetrievalService serving artifacts from jar. URL: https://github.com/apache/beam/pull/9681#discussion_r329763525 ## File path: runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactServiceTest.java ## @@ -0,0 +1,412 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.fnexecution.artifact; + +import com.google.common.base.Charsets; +import com.google.common.collect.ImmutableMap; +import java.io.ByteArrayOutputStream; +import java.io.FileOutputStream; +import java.io.IOException; +import java.net.URI; +import java.net.URL; +import java.net.URLClassLoader; +import java.nio.charset.Charset; +import java.nio.file.FileSystem; +import java.nio.file.FileSystems; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.Map; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import java.util.zip.ZipEntry; +import java.util.zip.ZipOutputStream; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi; +import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc; +import org.apache.beam.runners.fnexecution.GrpcFnServer; +import org.apache.beam.runners.fnexecution.InProcessServerFactory; +import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ManagedChannel; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.inprocess.InProcessChannelBuilder; +import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver; +import org.junit.Assert; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.TemporaryFolder; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** + * Tests for {@link ClassLoaderArtifactRetrievalService} and {@link + * JavaFilesystemArtifactStagingService}. + */ +@RunWith(JUnit4.class) +public class ClassLoaderArtifactServiceTest { + + @Rule public TemporaryFolder tempFolder = new TemporaryFolder(); + + private static final int ARTIFACT_CHUNK_SIZE = 100; + + private static final Charset BIJECTIVE_CHARSET = Charsets.ISO_8859_1; + + public interface ArtifactServicePair extends AutoCloseable { + +public String getStagingToken(String nonce); + +public ArtifactStagingServiceGrpc.ArtifactStagingServiceStub createStagingStub() +throws Exception; + +public ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub createStagingBlockingStub() +throws Exception; + +public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub createRetrievalStub() +throws Exception; + +public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub +createRetrievalBlockingStub() throws Exception; + } + + private ArtifactServicePair classLoaderService() throws IOException { +return new ArtifactServicePair() { + + JavaFilesystemArtifactStagingService stagingService; + GrpcFnServer stagingServer; + ClassLoaderArtifactRetrievalService retrievalService; + GrpcFnServer retrievalServer; + + ArtifactStagingServiceGrpc.ArtifactStagingServiceStub stagingStub; + ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub stagingBlockingStub; + ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub retrievalStub; + ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub retrievalBlockingStub; + + Path jarPath = Paths.get(tempFolder.ne
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=320835&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320835 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 30/Sep/19 20:30 Start Date: 30/Sep/19 20:30 Worklog Time Spent: 10m Work Description: mxm commented on issue #9418: [BEAM-5428] Implement cross-bundle user state caching in the Python SDK URL: https://github.com/apache/beam/pull/9418#issuecomment-536738137 Python tests broken on master, see https://builds.apache.org/job/beam_PreCommit_Python_Cron/ and https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320835) Time Spent: 28h 20m (was: 28h 10m) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Maximilian Michels >Priority: Major > Time Spent: 28h 20m > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941287#comment-16941287 ] Valentyn Tymofieiev commented on BEAM-8324: --- Sent https://github.com/apache/beam/pull/9695. See also: https://github.com/uqfoundation/dill/issues/341. > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941284#comment-16941284 ] Valentyn Tymofieiev commented on BEAM-8324: --- I think 0.3.0. will be a safer option for now. We can continue investigate 0.3.1.1. changes in https://issues.apache.org/jira/browse/BEAM-5878. > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320804 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 19:33 Start Date: 30/Sep/19 19:33 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9696: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9696 Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320791&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320791 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 19:28 Start Date: 30/Sep/19 19:28 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9693: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9693#issuecomment-536715145 Closing in favor of https://github.com/apache/beam/pull/9695 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320791) Time Spent: 40m (was: 0.5h) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 40m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320793 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 19:28 Start Date: 30/Sep/19 19:28 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9694: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9694 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320793) Time Spent: 1h (was: 50m) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 1h > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320792&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320792 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 19:28 Start Date: 30/Sep/19 19:28 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9693: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9693 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320792) Time Spent: 50m (was: 40m) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320790 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 19:27 Start Date: 30/Sep/19 19:27 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9695: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9695 Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.
[jira] [Commented] (BEAM-8303) Filesystems not properly registered using FileIO.write()
[ https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941273#comment-16941273 ] Preston Koprivica commented on BEAM-8303: - [~mxm] [~markflyhigh] I was able to test the fix by pulling Max's branch. I can verify that with the fix I'm no longer seeing the original error. Thanks so much for diagnosing and thanks for the quick turnaround. > Filesystems not properly registered using FileIO.write() > > > Key: BEAM-8303 > URL: https://issues.apache.org/jira/browse/BEAM-8303 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.15.0 >Reporter: Preston Koprivica >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.16.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > I’m getting the following error when attempting to use the FileIO apis > (beam-2.15.0) and integrating with AWS S3. I have setup the PipelineOptions > with all the relevant AWS options, so the filesystem registry **should** be > properly seeded by the time the graph is compiled and executed: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480) > at > org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93) > at > org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55) > at > org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106) > at > org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72) > at > org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47) > at > org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73) > at > org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503) > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > {code} > For reference, the write code resembles this: > {code:java} > FileIO.Write write = FileIO.write() > .via(ParquetIO.sink(schema)) > .to(options.getOutputDir()). // will be something like: > s3:/// > .withSuffix(".parquet"); > records.apply(String.format("Write(%s)", options.getOutputDir()), > write);{code} > The issue does not appear to be related to ParquetIO.sink(). I am able to > reliably reproduce the issue using JSON formatted records and TextIO.sink(), > as well. Moreover, AvroIO is affected if withWindowedWrites() option is > added. > Just trying some different knobs, I went ahead and set the following option: > {code:java} > write = write.withNoSpilling();{code} > This actually seemed to fix the issue, only to have it reemerge as I scaled > up the data set size. The stack trace, while very similar, reads: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) > at org.apache.beam.sdk.coders.KvCoder.decode(KvCo
[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941261#comment-16941261 ] Valentyn Tymofieiev commented on BEAM-8324: --- I am investigating whether this should be <0.3.1 or <0.3.2. It appears that 0.3.1.1. causes stackoverflow errors on master in one of the tests on Python 3.7, when running through tox. {noformat} test_remote_runner_display_data (apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... Fatal Python error: Cannot recover from stack overflow. Current thread 0x7fc19e177740 (most recent call first): File "/usr/local/google/home/valentyn/projects/beam/beam/beam/sdks/python/target/.tox/py37-gcp/lib/python3.7/site-packages/dill/_dill.py", line 382 in get File "/usr/lib/python3.7/pickle.py", line 502 in save File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple File "/usr/lib/python3.7/pickle.py", line 504 in save File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple File "/usr/lib/python3.7/pickle.py", line 504 in save File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce ... {noformat} However there is no error if we run the test by itself: {noformat} python ./setup.py test -s apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data test_remote_runner_display_data (apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... ok -- Ran 1 test in 0.063s OK {noformat} > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941248#comment-16941248 ] Ahmet Altay commented on BEAM-8324: --- Let's put an upperbound. How about dill>=0.3.1.1,<0.3.2. I believe this impacts 2.16.0 release. See: https://github.com/apache/beam/blob/release-2.16.0/sdks/python/setup.py > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.16.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay updated BEAM-8324: -- Priority: Blocker (was: Major) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-8093) py36-gcp is missing from the current tox.ini
[ https://issues.apache.org/jira/browse/BEAM-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri reassigned BEAM-8093: --- Assignee: Luke Cwik > py36-gcp is missing from the current tox.ini > > > Key: BEAM-8093 > URL: https://issues.apache.org/jira/browse/BEAM-8093 > Project: Beam > Issue Type: Sub-task > Components: testing >Affects Versions: 2.16.0 >Reporter: Valentyn Tymofieiev >Assignee: Luke Cwik >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > cc: [~udim] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-8093) py36-gcp is missing from the current tox.ini
[ https://issues.apache.org/jira/browse/BEAM-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri resolved BEAM-8093. - Fix Version/s: Not applicable Resolution: Fixed > py36-gcp is missing from the current tox.ini > > > Key: BEAM-8093 > URL: https://issues.apache.org/jira/browse/BEAM-8093 > Project: Beam > Issue Type: Sub-task > Components: testing >Affects Versions: 2.16.0 >Reporter: Valentyn Tymofieiev >Assignee: Luke Cwik >Priority: Major > Fix For: Not applicable > > Time Spent: 2h 10m > Remaining Estimate: 0h > > cc: [~udim] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-6923) OOM errors in jobServer when using GCS artifactDir
[ https://issues.apache.org/jira/browse/BEAM-6923?focusedWorklogId=320736&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320736 ] ASF GitHub Bot logged work on BEAM-6923: Author: ASF GitHub Bot Created on: 30/Sep/19 18:28 Start Date: 30/Sep/19 18:28 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #9647: [BEAM-6923] limit gcs buffer size to 1MB for artifact upload URL: https://github.com/apache/beam/pull/9647 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320736) Time Spent: 3h (was: 2h 50m) > OOM errors in jobServer when using GCS artifactDir > -- > > Key: BEAM-6923 > URL: https://issues.apache.org/jira/browse/BEAM-6923 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness >Reporter: Lukasz Gajowy >Assignee: Ankur Goenka >Priority: Major > Attachments: Instance counts.png, Paths to GC root.png, > Telemetries.png, beam6923-flink156.m4v, beam6923flink182.m4v, heapdump > size-sorted.png > > Time Spent: 3h > Remaining Estimate: 0h > > When starting jobServer with artifactDir pointing to a GCS bucket: > {code:java} > ./gradlew :beam-runners-flink_2.11-job-server:runShadow > -PflinkMasterUrl=localhost:8081 -PartifactsDir=gs://the-bucket{code} > and running a Java portable pipeline with the following, portability related > pipeline options: > {code:java} > --runner=PortableRunner --jobEndpoint=localhost:8099 > --defaultEnvironmentType=DOCKER > --defaultEnvironmentConfig=gcr.io//java:latest'{code} > > I'm facing a series of OOM errors, like this: > {code:java} > Exception in thread "grpc-default-executor-3" java.lang.OutOfMemoryError: > Java heap space > at > com.google.api.client.googleapis.media.MediaHttpUploader.buildContentChunk(MediaHttpUploader.java:606) > at > com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:408) > at > com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336) > at > com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:508) > at > com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432) > at > com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549) > at > com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:301) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745){code} > > This does not happen when I'm using a local filesystem for the artifact > staging location. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320707 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 17:52 Start Date: 30/Sep/19 17:52 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9694: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9694 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py
[jira] [Commented] (BEAM-8303) Filesystems not properly registered using FileIO.write()
[ https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941177#comment-16941177 ] Mark Liu commented on BEAM-8303: Do you have ETA for the fix? I'm ready to cut a RC. > Filesystems not properly registered using FileIO.write() > > > Key: BEAM-8303 > URL: https://issues.apache.org/jira/browse/BEAM-8303 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.15.0 >Reporter: Preston Koprivica >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.16.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > I’m getting the following error when attempting to use the FileIO apis > (beam-2.15.0) and integrating with AWS S3. I have setup the PipelineOptions > with all the relevant AWS options, so the filesystem registry **should** be > properly seeded by the time the graph is compiled and executed: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83) > at > org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480) > at > org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93) > at > org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55) > at > org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106) > at > org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72) > at > org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47) > at > org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73) > at > org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503) > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > {code} > For reference, the write code resembles this: > {code:java} > FileIO.Write write = FileIO.write() > .via(ParquetIO.sink(schema)) > .to(options.getOutputDir()). // will be something like: > s3:/// > .withSuffix(".parquet"); > records.apply(String.format("Write(%s)", options.getOutputDir()), > write);{code} > The issue does not appear to be related to ParquetIO.sink(). I am able to > reliably reproduce the issue using JSON formatted records and TextIO.sink(), > as well. Moreover, AvroIO is affected if withWindowedWrites() option is > added. > Just trying some different knobs, I went ahead and set the following option: > {code:java} > write = write.withNoSpilling();{code} > This actually seemed to fix the issue, only to have it reemerge as I scaled > up the data set size. The stack trace, while very similar, reads: > {code:java} > java.lang.IllegalArgumentException: No filesystem found for scheme s3 > at > org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456) > at > org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149) > at > org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105) > at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) > at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543) > at > org.apache.beam.sdk.util.WindowedValue$FullWindowedV
[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320702&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320702 ] ASF GitHub Bot logged work on BEAM-8324: Author: ASF GitHub Bot Created on: 30/Sep/19 17:43 Start Date: 30/Sep/19 17:43 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #9693: [BEAM-8324] Restrict the upper bound for dill due to incompatibility between versions 0.3.0 and 0.3.1.1. URL: https://github.com/apache/beam/pull/9693 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py
[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.
[ https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320684&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320684 ] ASF GitHub Bot logged work on BEAM-7223: Author: ASF GitHub Bot Created on: 30/Sep/19 17:14 Start Date: 30/Sep/19 17:14 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add Python 3.5 Flink ValidatesRunner postcommit suite. URL: https://github.com/apache/beam/pull/9691#issuecomment-536659352 Run Python 3.5 Flink ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320684) Time Spent: 3h 40m (was: 3.5h) > Add ValidatesRunner test suite for Flink on Python 3. > - > > Key: BEAM-7223 > URL: https://issues.apache.org/jira/browse/BEAM-7223 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Ankur Goenka >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.17.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Add py3 integration tests for Flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-8314) Beam Fn Api metrics piling causes pipeline to stuck after running for a while
[ https://issues.apache.org/jira/browse/BEAM-8314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yichi Zhang closed BEAM-8314. - Resolution: Fixed > Beam Fn Api metrics piling causes pipeline to stuck after running for a while > - > > Key: BEAM-8314 > URL: https://issues.apache.org/jira/browse/BEAM-8314 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Yichi Zhang >Priority: Blocker > Fix For: 2.16.0 > > Attachments: E4UaSUhJJKF.png > > Time Spent: 5h 20m > Remaining Estimate: 0h > > Seems that in StreamingDataflowWorker we are not able to update the metrics > fast enough to dataflow service, the piling metrics causes memory usage to > increase and eventually leads to excessive memory thrashing/GC. And it will > almost stop the pipeline from processing new items. > > !E4UaSUhJJKF.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-7919) Add a Python 3 test scenario for MongoDB IO
[ https://issues.apache.org/jira/browse/BEAM-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yichi Zhang resolved BEAM-7919. --- Fix Version/s: 2.17.0 Resolution: Fixed > Add a Python 3 test scenario for MongoDB IO > --- > > Key: BEAM-7919 > URL: https://issues.apache.org/jira/browse/BEAM-7919 > Project: Beam > Issue Type: Sub-task > Components: io-ideas >Reporter: Valentyn Tymofieiev >Assignee: Yichi Zhang >Priority: Major > Fix For: 2.17.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Python 2 MongoDB IO suite was added in: > https://github.com/apache/beam/commit/17bf89d6070565b715f44ecb5f6394219b94cfe6 > We should also exercise this IO in Python 3. > cc: [~chamikara] [~altay] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.
[ https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320681 ] ASF GitHub Bot logged work on BEAM-7223: Author: ASF GitHub Bot Created on: 30/Sep/19 17:11 Start Date: 30/Sep/19 17:11 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add Python 3.5 Flink ValidatesRunner postcommit suite. URL: https://github.com/apache/beam/pull/9691#issuecomment-536658374 Run Python 3.5 ValidatesRunner Flink This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320681) Time Spent: 3.5h (was: 3h 20m) > Add ValidatesRunner test suite for Flink on Python 3. > - > > Key: BEAM-7223 > URL: https://issues.apache.org/jira/browse/BEAM-7223 > Project: Beam > Issue Type: Sub-task > Components: runner-flink >Reporter: Ankur Goenka >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.17.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > Add py3 integration tests for Flink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-7919) Add a Python 3 test scenario for MongoDB IO
[ https://issues.apache.org/jira/browse/BEAM-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-7919 started by Yichi Zhang. - > Add a Python 3 test scenario for MongoDB IO > --- > > Key: BEAM-7919 > URL: https://issues.apache.org/jira/browse/BEAM-7919 > Project: Beam > Issue Type: Sub-task > Components: io-ideas >Reporter: Valentyn Tymofieiev >Assignee: Yichi Zhang >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > Python 2 MongoDB IO suite was added in: > https://github.com/apache/beam/commit/17bf89d6070565b715f44ecb5f6394219b94cfe6 > We should also exercise this IO in Python 3. > cc: [~chamikara] [~altay] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valentyn Tymofieiev updated BEAM-8324: -- Status: Open (was: Triage Needed) > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.16.0 > > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941147#comment-16941147 ] Valentyn Tymofieiev commented on BEAM-8324: --- I also checked that 2.15.0 release was not affected by this. > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.16.0 > > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941146#comment-16941146 ] Valentyn Tymofieiev commented on BEAM-8324: --- I think we should restore the upper bound for dill, since it may be difficult to avoid this breakages unless dill has compatibility guarantees between versions. > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.16.0 > > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)
[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320675 ] ASF GitHub Bot logged work on BEAM-7389: Author: ASF GitHub Bot Created on: 30/Sep/19 17:04 Start Date: 30/Sep/19 17:04 Worklog Time Spent: 10m Work Description: davidcavazos commented on issue #9664: [BEAM-7389] Created code files to match doc filenames URL: https://github.com/apache/beam/pull/9664#issuecomment-536655440 Updated docs and obsolete files deleted in #9692, but they won't render correctly until this one is merged. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 320675) Time Spent: 61.5h (was: 61h 20m) > Colab examples for element-wise transforms (Python) > --- > > Key: BEAM-7389 > URL: https://issues.apache.org/jira/browse/BEAM-7389 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Rose Nguyen >Assignee: David Cavazos >Priority: Minor > Time Spent: 61.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-6896) Beam Dependency Update Request: PyYAML
[ https://issues.apache.org/jira/browse/BEAM-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Zou updated BEAM-6896: Fix Version/s: (was: Not applicable) 2.17.0 > Beam Dependency Update Request: PyYAML > -- > > Key: BEAM-6896 > URL: https://issues.apache.org/jira/browse/BEAM-6896 > Project: Beam > Issue Type: Bug > Components: dependencies >Reporter: Beam JIRA Bot >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.17.0 > > > - 2019-03-25 04:17:47.501359 > - > Please consider upgrading the dependency PyYAML. > The current version is 3.13. The latest version is 5.1 > cc: > Please refer to [Beam Dependency Guide > |https://beam.apache.org/contribute/dependencies/]for more information. > Do Not Modify The Description Above. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-8326) Run Portable_Python PreCommit fails with pickling
[ https://issues.apache.org/jira/browse/BEAM-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik closed BEAM-8326. --- Fix Version/s: Not applicable Resolution: Duplicate > Run Portable_Python PreCommit fails with pickling > - > > Key: BEAM-8326 > URL: https://issues.apache.org/jira/browse/BEAM-8326 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: Not applicable > > > Test log: > https://builds.apache.org/job/beam_PreCommit_Portable_Python_Phrase/434/console > I see the following error: > 09:31:34 Traceback (most recent call last): > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", > line 261, in loads > 09:31:34 return dill.loads(s) > 09:31:34 File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line > 317, in loads > 09:31:34 return load(file, ignore) > 09:31:34 File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line > 305, in load > 09:31:34 obj = pik.load() > 09:31:34 TypeError: _create_function() takes from 2 to 6 positional arguments > but 7 were given > 09:31:34 > 09:31:34 During handling of the above exception, another exception occurred: > 09:31:34 > 09:31:34 Traceback (most recent call last): > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 158, in _execute > 09:31:34 response = task() > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 191, in > 09:31:34 self._execute(lambda: worker.do_instruction(work), work) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 343, in do_instruction > 09:31:34 request.instruction_id) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 363, in process_bundle > 09:31:34 instruction_id, request.process_bundle_descriptor_id) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 306, in get > 09:31:34 self.data_channel_factory) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 580, in __init__ > 09:31:34 self.ops = > self.create_execution_tree(self.process_bundle_descriptor) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 624, in create_execution_tree > 09:31:34 descriptor.transforms, key=topological_height, reverse=True)]) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 623, in > 09:31:34 for transform_id in sorted( > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 548, in wrapper > 09:31:34 result = cache[args] = func(*args) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 607, in get_operation > 09:31:34 in descriptor.transforms[transform_id].outputs.items() > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 606, in > 09:31:34 for tag, pcoll_id > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 605, in > 09:31:34 tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]] > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 548, in wrapper > 09:31:34 result = cache[args] = func(*args) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 610, in get_operation > 09:31:34 transform_id, transform_consumers) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 869, in create_operation > 09:31:34 return creator(self, transform_id, transform_proto, payload, > consumers) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 1112, in create > 09:31:34 serialized_fn, parameter) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 1150, in _create_pardo_operation > 09:31:34 dofn_data = pickler.loads(serialized_fn) > 09:31:34 File > "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", > line 265, in loads > 09:31:34 return dill.loads(s) > 09:31:34 File "/usr/local/lib/python3.5/
[jira] [Comment Edited] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941140#comment-16941140 ] Valentyn Tymofieiev edited comment on BEAM-8324 at 9/30/19 5:01 PM: I think the cause of the error is that we pickle main session using dill==0.3.1.1, and unpickle it using dill==0.3.0 (which is installed on DF workers). We can update the version of Dill in DF workers, or restrict the upperbound for dill, which may be advisable given [~yoshiki.obata]'s report in https://issues.apache.org/jira/browse/BEAM-5878?focusedCommentId=16941114&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16941114. was (Author: tvalentyn): I think the cause of the error is that we pickle main session using dill==0.3.1.1, and unpickle it using dill==0.2.9 (which is installed on DF workers). We can update the version of Dill in DF workers, or restrict the upperbound for dill, which may be advisable given [~yoshiki.obata]'s report in https://issues.apache.org/jira/browse/BEAM-5878?focusedCommentId=16941114&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16941114. > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.16.0 > > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given
[ https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941140#comment-16941140 ] Valentyn Tymofieiev commented on BEAM-8324: --- I think the cause of the error is that we pickle main session using dill==0.3.1.1, and unpickle it using dill==0.2.9 (which is installed on DF workers). We can update the version of Dill in DF workers, or restrict the upperbound for dill, which may be advisable given [~yoshiki.obata]'s report in https://issues.apache.org/jira/browse/BEAM-5878?focusedCommentId=16941114&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16941114. > Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 > positional arguments but 7 were given > - > > Key: BEAM-8324 > URL: https://issues.apache.org/jira/browse/BEAM-8324 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Fix For: 2.16.0 > > > {noformat} > Dataflow pipeline failed. State: FAILED, Error: > Traceback (most recent call last): > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 773, in run > self._load_main_session(self.local_staging_directory) > File > "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line > 489, in _load_main_session > pickler.load_session(session_file) > File > "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", > line 287, in load_session > return dill.load_session(file_path) > File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in > load_session > module = unpickler.load() > TypeError: _create_function() takes from 2 to 6 positional arguments but 7 > were given > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)
[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320672 ] ASF GitHub Bot logged work on BEAM-7389: Author: ASF GitHub Bot Created on: 30/Sep/19 17:01 Start Date: 30/Sep/19 17:01 Worklog Time Spent: 10m Work Description: davidcavazos commented on pull request #9692: [BEAM-7389] Update docs with matching code files URL: https://github.com/apache/beam/pull/9692 **Merge after #9664** Second phase to update code filenames to match the docs filenames. * Updates the docs * Deletes obsolete code files TODO: * Wait until #9664 merges so new code files are available * Regenerate the `flatmap` notebook > Staged files will be broken until the new code files are merged FlatMap: http://apache-beam-website-pull-requests.storage.googleapis.com/9669/documentation/transforms/python/elementwise/flatmap/index.html ToString: http://apache-beam-website-pull-requests.storage.googleapis.com/9669/documentation/transforms/python/elementwise/tostring/index.html WithTimestamps: http://apache-beam-website-pull-requests.storage.googleapis.com/9669/documentation/transforms/python/elementwise/withtimestamps/index.html Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job