[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=321060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321060
 ]

ASF GitHub Bot logged work on BEAM-6855:


Author: ASF GitHub Bot
Created on: 01/Oct/19 05:50
Start Date: 01/Oct/19 05:50
Worklog Time Spent: 10m 
  Work Description: salmanVD commented on issue #9612: [BEAM-6855] Side 
inputs are not supported when using the state API
URL: https://github.com/apache/beam/pull/9612#issuecomment-536877287
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 321060)
Time Spent: 6h 20m  (was: 6h 10m)

> Side inputs are not supported when using the state API
> --
>
> Key: BEAM-6855
> URL: https://issues.apache.org/jira/browse/BEAM-6855
> Project: Beam
> Issue Type: Bug
> Components: runner-core, runner-dataflow, runner-direct
> Reporter: Reuven Lax
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Time Spent: 6h 20m
> Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=321059&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321059
 ]

ASF GitHub Bot logged work on BEAM-6855:


Author: ASF GitHub Bot
Created on: 01/Oct/19 05:23
Start Date: 01/Oct/19 05:23
Worklog Time Spent: 10m 
  Work Description: salmanVD commented on issue #9612: [BEAM-6855] Side 
inputs are not supported when using the state API
URL: https://github.com/apache/beam/pull/9612#issuecomment-536870195
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 321059)
Time Spent: 6h 10m  (was: 6h)

> Side inputs are not supported when using the state API
> --
>
> Key: BEAM-6855
> URL: https://issues.apache.org/jira/browse/BEAM-6855
> Project: Beam
> Issue Type: Bug
> Components: runner-core, runner-dataflow, runner-direct
> Reporter: Reuven Lax
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5878?focusedWorklogId=321025&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321025
 ]

ASF GitHub Bot logged work on BEAM-5878:


Author: ASF GitHub Bot
Created on: 01/Oct/19 02:04
Start Date: 01/Oct/19 02:04
Worklog Time Spent: 10m 
  Work Description: lazylynx commented on pull request #9686: 
[WIP][BEAM-5878] update dill min version to 0.3.1.1 and add test for functions 
with Keyword-only arguments
URL: https://github.com/apache/beam/pull/9686
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 321025)
Time Spent: 15h  (was: 14h 50m)

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: yoshiki obata
> Priority: Minor
> Fix For: 2.17.0
>
> Time Spent: 15h
> Remaining Estimate: 0h
>
> Python 3.0 [adds the ability|https://www.python.org/dev/peps/pep-3102/] to 
> define functions with keyword-only arguments. 
> Currently Beam does not handle them correctly. [~ruoyu] pointed out [one 
> place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118]
>  in our codebase that we should fix: in Python 3.0, inspect.getargspec() 
> will fail on functions with keyword-only arguments, but a new function, 
> [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec],
>  supports them.
> There may be implications for our (best-effort) type-hints machinery.
> We should also add a Py3-only unit test that covers DoFns with keyword-only 
> arguments once Beam Python 3 tests are in good shape.
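
For illustration, the behaviour described in the issue can be observed with the standard library alone. This is a minimal sketch, not Beam code: `process` is a hypothetical stand-in for a DoFn method, and the only API assumed is Python 3's `inspect.getfullargspec()`, which reports keyword-only arguments in fields of their own.

```python
import inspect


def process(element, *, prefix="x"):
    """Hypothetical function with a keyword-only argument (PEP 3102)."""
    return prefix + str(element)


# getfullargspec() separates positional arguments from keyword-only ones.
spec = inspect.getfullargspec(process)
positional_args = spec.args                  # names before the bare '*'
keyword_only_args = spec.kwonlyargs          # names after the bare '*'
keyword_only_defaults = spec.kwonlydefaults  # defaults for keyword-only names
```

`inspect.getargspec()` is deliberately not called in the sketch, since it is not available in recent Python 3 releases.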





[jira] [Work logged] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5878?focusedWorklogId=321024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321024
 ]

ASF GitHub Bot logged work on BEAM-5878:


Author: ASF GitHub Bot
Created on: 01/Oct/19 02:04
Start Date: 01/Oct/19 02:04
Worklog Time Spent: 10m 
  Work Description: lazylynx commented on issue #9686: [WIP][BEAM-5878] 
update dill min version to 0.3.1.1 and add test for functions with Keyword-only 
arguments
URL: https://github.com/apache/beam/pull/9686#issuecomment-536826908
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 321024)
Time Spent: 14h 50m  (was: 14h 40m)

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: yoshiki obata
> Priority: Minor
> Fix For: 2.17.0
>
> Time Spent: 14h 50m
> Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320984&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320984
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 01/Oct/19 00:06
Start Date: 01/Oct/19 00:06
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on issue #9669: [BEAM-7389] 
Update include buttons to support multiple languages
URL: https://github.com/apache/beam/pull/9669#issuecomment-536801256
 
 
   I'm actually regularizing some names in #9664 alongside #9692. Basically we 
would need to rename everything to `elementwise`, since `element-wise` is not 
permitted in Python; that's why it ended up being `element_wise`. In Java we'll 
also have a similar problem unless we go for `elementwise`.
   
   That would imply moving/renaming all the website source files as well as 
Python source files to read the new directory names. That wouldn't cause any 
problems if done in two stages as the PRs I mentioned above. It would just be 
two large PRs even if they don't change any of the content. I like the idea 
since I would like to keep things consistent.
   
   Just to make things clear, here is what would have to be done:
   
   1. PR1: Create new `elementwise` directory.
      * **Copy** over the `element_wise` directory into `elementwise`.
      * Make sure all tests pass.
      * We would have (temporarily) two copies of the code snippets.
   
   2. PR2: Update the docs.
      * **Move** the docs from `element-wise` to `elementwise`.
      * **Update** all code snippets to point to the new `elementwise` Python directory.
      * Regenerate notebooks.
      * **Delete** old `element_wise` Python directory.
   
   These would be big PRs even if there is no actual content change.
   
   @aaltay How about we do these changes in the open PRs we already have for 
this?
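
The naming constraint mentioned in this comment can be checked directly in Python. This is a small sketch that assumes the directory names are meant to be used as package names in `import` statements; `is_importable_name` is a hypothetical helper, not part of Beam.

```python
import keyword


def is_importable_name(name):
    """True if `name` can appear as a module or package name in an import."""
    return name.isidentifier() and not keyword.iskeyword(name)


# The three spellings discussed in the comment.
candidates = ("element-wise", "element_wise", "elementwise")
results = {name: is_importable_name(name) for name in candidates}
```

Only the hyphenated spelling is rejected, which is consistent with the comment's proposal to settle on a single hyphen-free name across the docs and the source tree.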
 



Issue Time Tracking
---

Worklog Id: (was: 320984)
Time Spent: 61h 50m  (was: 61h 40m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Rose Nguyen
> Assignee: David Cavazos
> Priority: Minor
> Time Spent: 61h 50m
> Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320981
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 01/Oct/19 00:04
Start Date: 01/Oct/19 00:04
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on issue #9669: [BEAM-7389] 
Update include buttons to support multiple languages
URL: https://github.com/apache/beam/pull/9669#issuecomment-536801256
 
 
 



Issue Time Tracking
---

Worklog Id: (was: 320981)
Time Spent: 61h 40m  (was: 61.5h)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Rose Nguyen
> Assignee: David Cavazos
> Priority: Minor
> Time Spent: 61h 40m
> Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-7049) Merge multiple input to one BeamUnionRel

2019-09-30 Thread sridhar Reddy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941410#comment-16941410
 ] 

sridhar Reddy commented on BEAM-7049:
-

[~amaliujia] I just want to make sure that we are on the same page as far as 
the next steps are concerned. At this point, I am waiting for your 
instructions on how to proceed: how to split UNION ALL and UNION, and 
which aspects of the SQL optimizer I should be looking at. 

Please take your time to update the Jira. I just want to make sure we are 
not waiting for each other.

 

> Merge multiple input to one BeamUnionRel
> 
>
> Key: BEAM-7049
> URL: https://issues.apache.org/jira/browse/BEAM-7049
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Rui Wang
> Assignee: sridhar Reddy
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> BeamUnionRel assumes exactly two inputs and rejects more. So `a UNION b UNION c` 
> has to be created as UNION(a, UNION(b, c)), which needs two shuffles. If 
> BeamUnionRel can handle multiple inputs, we will have only one shuffle.
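
The rewrite the issue asks for can be sketched as a tree flattening. This is an illustrative model only, not the Calcite or Beam SQL API: `Scan`, `UnionNode`, and `flatten_union` are hypothetical names, and the sketch conservatively merges a nested union only when it is the same kind as its parent (both UNION or both UNION ALL).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scan:
    """Leaf input, e.g. a table scan (hypothetical stand-in for a rel node)."""
    table: str


@dataclass
class UnionNode:
    """Hypothetical n-ary union; keep_duplicates=True models UNION ALL."""
    inputs: List[object] = field(default_factory=list)
    keep_duplicates: bool = False


def flatten_union(node):
    """Merge nested unions of the same kind into one n-ary union."""
    if not isinstance(node, UnionNode):
        return node
    flat_inputs = []
    for child in node.inputs:
        child = flatten_union(child)
        same_kind = (isinstance(child, UnionNode)
                     and child.keep_duplicates == node.keep_duplicates)
        if same_kind:
            flat_inputs.extend(child.inputs)  # hoist the grandchildren
        else:
            flat_inputs.append(child)
    return UnionNode(flat_inputs, node.keep_duplicates)


# UNION(a, UNION(b, c)) becomes a single three-input union.
nested = UnionNode([Scan("a"), UnionNode([Scan("b"), Scan("c")])])
flat = flatten_union(nested)

# A UNION ALL nested under a UNION is left alone in this sketch.
mixed = UnionNode([Scan("a"), UnionNode([Scan("b"), Scan("c")], True)])
mixed_flat = flatten_union(mixed)
```

With a single n-ary node, one grouping step can cover all inputs, which is the "only one shuffle" outcome the description refers to.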





[jira] [Commented] (BEAM-8303) Filesystems not properly registered using FileIO.write()

2019-09-30 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941407#comment-16941407
 ] 

Mark Liu commented on BEAM-8303:


It looks like this issue was already present in 2.15.0, so it is not a regression in 
2.16.0. According to the policy in 
https://beam.apache.org/contribute/release-guide/#4-triage-release-blocking-issues-in-jira,
 it is not necessarily a blocker for 2.16.0. I'll move this issue to 2.17.0, but if 
possible, we can still backport the fix to 2.16.0.

> Filesystems not properly registered using FileIO.write()
> 
>
> Key: BEAM-8303
> URL: https://issues.apache.org/jira/browse/BEAM-8303
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.15.0
> Reporter: Preston Koprivica
> Assignee: Maximilian Michels
> Priority: Critical
> Fix For: 2.16.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>

[jira] [Updated] (BEAM-8303) Filesystems not properly registered using FileIO.write()

2019-09-30 Thread Mark Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu updated BEAM-8303:
---
Fix Version/s: (was: 2.16.0)
   2.17.0

> Filesystems not properly registered using FileIO.write()
> 
>
> Key: BEAM-8303
> URL: https://issues.apache.org/jira/browse/BEAM-8303
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.15.0
> Reporter: Preston Koprivica
> Assignee: Maximilian Michels
> Priority: Critical
> Fix For: 2.17.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> I’m getting the following error when attempting to use the FileIO APIs 
> (beam-2.15.0) and integrating with AWS S3.  I have set up the PipelineOptions 
> with all the relevant AWS options, so the filesystem registry **should** be 
> properly seeded by the time the graph is compiled and executed:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83)
>     at org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480)
>     at org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93)
>     at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
>     at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106)
>     at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72)
>     at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47)
>     at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
>     at org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107)
>     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503)
>     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> For reference, the write code resembles this:
> {code:java}
>  FileIO.Write write = FileIO.write()
>     .via(ParquetIO.sink(schema))
>     .to(options.getOutputDir()) // will be something like: s3:///
>     .withSuffix(".parquet");
> records.apply(String.format("Write(%s)", options.getOutputDir()), write);{code}
> The issue does not appear to be related to ParquetIO.sink().  I am able to 
> reliably reproduce the issue using JSON formatted records and TextIO.sink(), 
> as well.  Moreover, AvroIO is affected if withWindowedWrites() option is 
> added.
> Just trying some different knobs, I went ahead and set the following option:
> {code:java}
> write = write.withNoSpilling();{code}
> This actually seemed to fix the issue, only to have it reemerge as I scaled 
> up the data set size.  The stack trace, while very similar, reads:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82)
>     at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534)
>     

[jira] [Created] (BEAM-8331) Vendored calcite breaks if another calcite is on the class path

2019-09-30 Thread Andrew Pilloud (Jira)
Andrew Pilloud created BEAM-8331:


 Summary: Vendored calcite breaks if another calcite is on the 
class path
 Key: BEAM-8331
 URL: https://issues.apache.org/jira/browse/BEAM-8331
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Affects Versions: 2.15.0
Reporter: Andrew Pilloud


If the Beam vendored Calcite and a non-vendored Calcite are both on the 
classpath, neither version works. This is because the non-JDBC Calcite path 
uses JDBC as an easy way to perform reflection. (This affects the non-JDBC 
version of Calcite.) We need to rewrite the Calcite JDBC URLs as part of our 
vendoring (for example, 'jdbc:calcite:' to 'jdbc:beam-vendor-calcite:'). Example 
of where this happens: 
[https://github.com/apache/calcite/blob/0cce229903a845a7b8ed36cf86d6078fd82d73d3/core/src/main/java/org/apache/calcite/tools/Frameworks.java#L175]

 
{code:java}
java.lang.RuntimeException: java.lang.RuntimeException: Property 'org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem' not valid for plugin type org.apache.calcite.rel.type.RelDataTypeSystem
    at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:160)
    at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:115)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.<init>(ZetaSQLPlannerImpl.java:86)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.<init>(ZetaSQLQueryPlanner.java:55){code}
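
One plausible reading of the clash, sketched as a simplified model rather than actual JDBC or Calcite code: drivers are looked up by URL prefix, and the first registered driver that accepts the URL wins. The driver names and the `pick_driver` helper are hypothetical; only the two URL prefixes come from the issue text.

```python
def pick_driver(drivers, url):
    """Return the name of the first registered driver whose prefix matches url."""
    for prefix, name in drivers:
        if url.startswith(prefix):
            return name
    return None


# Both copies register the same prefix, so the vendored caller may be
# handed the non-vendored driver (and vice versa).
clashing = [
    ("jdbc:calcite:", "plain-calcite"),
    ("jdbc:calcite:", "vendored-calcite"),
]

# After rewriting the vendored prefix, each URL resolves to its own driver.
separated = [
    ("jdbc:calcite:", "plain-calcite"),
    ("jdbc:beam-vendor-calcite:", "vendored-calcite"),
]

wrong_driver = pick_driver(clashing, "jdbc:calcite:")
vendored_driver = pick_driver(separated, "jdbc:beam-vendor-calcite:")
plain_driver = pick_driver(separated, "jdbc:calcite:")
```

In this model the fix is purely a matter of giving the vendored copy a prefix of its own, which matches the rewrite proposed in the issue.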





[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320968&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320968
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:07
Start Date: 30/Sep/19 23:07
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329823135
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactRetrievalService.java
 ##
 @@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import java.io.IOException;
+import java.io.InputStream;
+
+/**
+ * An {@link ArtifactRetrievalService} that loads artifacts as {@link ClassLoader} resources.
+ *
+ * The retrieval token should be a path to a JSON-formatted ProxyManifest accessible vai {@link
+ * ClassLoader#getResource(String)} whose resource locations also point to paths loadable via {@link
+ * ClassLoader#getResource(String)}.
+ */
+public class ClassLoaderArtifactRetrievalService extends AbstractArtifactRetrievalService {
+
+  private final ClassLoader classLoader;
+
+  public ClassLoaderArtifactRetrievalService() {
+    this(ClassLoaderArtifactRetrievalService.class.getClassLoader());
+  }
+
+  public ClassLoaderArtifactRetrievalService(ClassLoader classLoader) {
+    this.classLoader = classLoader;
+  }
+
+  @Override
+  public InputStream openManifest(String retrievalToken) throws IOException {
+    return openUri(retrievalToken, retrievalToken);
+  }
+
+  @Override
+  public InputStream openUri(String retrievalToken, String uri) throws IOException {
+    if (uri.charAt(0) == '/') {
 
 Review comment:
   I always had to use absolute paths when loading from the classpath in a jar. 
Although I guess that might depend on which class we're loading from?
 



Issue Time Tracking
---

Worklog Id: (was: 320968)
Time Spent: 1.5h  (was: 1h 20m)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink
> Reporter: Kyle Weaver
> Assignee: Kyle Weaver
> Priority: Major
> Labels: portability-flink
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Currently, Flink job jars re-stage all artifacts at runtime (on the 
> JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. 
> However, since the manifest and all the artifacts live on the classpath of 
> the jar, and everything from the classpath is copied to the Flink workers 
> anyway [2], it should not be necessary to do additional artifact staging. We 
> could replace BeamFileSystemArtifactRetrievalService in this case with a 
> simple ArtifactRetrievalService that just pulls the artifacts from the 
> classpath.
>  
>  [1] 
> [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java]
> [2] 
> [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93]





[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320964
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:07
Start Date: 30/Sep/19 23:07
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329821776
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactRetrievalService.java
 ##
 @@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import java.io.IOException;
+import java.io.InputStream;
+
+/**
+ * An {@link ArtifactRetrievalService} that loads artifacts as {@link ClassLoader} resources.
+ *
+ * The retrieval token should be a path to a JSON-formatted ProxyManifest accessible vai {@link
 
 Review comment:
   via
 



Issue Time Tracking
---

Worklog Id: (was: 320964)
Time Spent: 1h 10m  (was: 1h)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, Flink job jars re-stage all artifacts at runtime (on the 
> JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. 
> However, since the manifest and all the artifacts live on the classpath of 
> the jar, and everything from the classpath is copied to the Flink workers 
> anyway [2], it should not be necessary to do additional artifact staging. We 
> could replace BeamFileSystemArtifactRetrievalService in this case with a 
> simple ArtifactRetrievalService that just pulls the artifacts from the 
> classpath.
>  
>  [1] 
> [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java]
> [2] 
> [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320967
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:07
Start Date: 30/Sep/19 23:07
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329820112
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactServiceTest.java
 ##
 @@ -0,0 +1,401 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.google.common.base.Charsets;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.net.URI;
+import java.net.URL;
+import java.net.URLClassLoader;
+import java.nio.charset.Charset;
+import java.nio.file.FileSystem;
+import java.nio.file.FileSystems;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipOutputStream;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi;
+import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ManagedChannel;
+import 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.inprocess.InProcessChannelBuilder;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link ClassLoaderArtifactRetrievalService} and {@link
+ * JavaFilesystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class ClassLoaderArtifactServiceTest {
+
+  @Rule public TemporaryFolder tempFolder = new TemporaryFolder();
+
+  private static final int ARTIFACT_CHUNK_SIZE = 100;
+
+  private static final Charset BIJECTIVE_CHARSET = Charsets.ISO_8859_1;
+
+  public interface ArtifactServicePair extends AutoCloseable {
+
+public String getStagingToken(String nonce);
+
+public ArtifactStagingServiceGrpc.ArtifactStagingServiceStub 
createStagingStub()
+throws Exception;
+
+public ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub 
createStagingBlockingStub()
+throws Exception;
+
+public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub 
createRetrievalStub()
+throws Exception;
+
+public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub
+createRetrievalBlockingStub() throws Exception;
+  }
+
+  private ArtifactServicePair classLoaderService() throws IOException {
 
 Review comment:
   Since this is a sizable chunk of code for a test, it might be good to 
add a comment explaining what it is for (i.e., emulating a `JarCreator`).
 



Issue Time Tracking
---

Worklog Id: (was: 320967)
Time Spent: 1h 20m  (was: 1h 10m)

> Flink portable pipeline 

[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320965
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:07
Start Date: 30/Sep/19 23:07
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329814570
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/JavaFilesystemArtifactStagingService.java
 ##
 @@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.channels.Channels;
+import java.nio.channels.WritableByteChannel;
+import java.nio.file.FileSystem;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.Comparator;
+import java.util.stream.Stream;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+
+/**
+ * An {@link ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase} that 
loads artifacts into a
+ * Java FileSystem.
 
 Review comment:
   Could you `@link` the FileSystem class as well?
 



Issue Time Tracking
---

Worklog Id: (was: 320965)
Time Spent: 1h 10m  (was: 1h)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320966&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320966
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 23:07
Start Date: 30/Sep/19 23:07
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329814389
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/JavaFilesystemArtifactStagingService.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.channels.Channels;
+import java.nio.channels.WritableByteChannel;
+import java.nio.file.FileSystem;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.Comparator;
+import java.util.stream.Stream;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+
+/**
+ * An {@link ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase} that 
loads artifacts into a
+ * Java FileSystem.
+ */
+public class JavaFilesystemArtifactStagingService extends 
AbstractArtifactStagingService {
 
 Review comment:
   SGTM, I agree it's nice to decouple this logic from 
`PortablePipelineJarCreator`.
 



Issue Time Tracking
---

Worklog Id: (was: 320966)
Time Spent: 1h 10m  (was: 1h)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8303) Filesystems not properly registered using FileIO.write()

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8303?focusedWorklogId=320948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320948
 ]

ASF GitHub Bot logged work on BEAM-8303:


Author: ASF GitHub Bot
Created on: 30/Sep/19 22:48
Start Date: 30/Sep/19 22:48
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #9688: [BEAM-8303] 
Ensure FileSystems registration code runs in non UDF Flink operators
URL: https://github.com/apache/beam/pull/9688#issuecomment-536783742
 
 
   The fix for the Portable_Python failure was merged in 
https://issues.apache.org/jira/browse/BEAM-8324.
 



Issue Time Tracking
---

Worklog Id: (was: 320948)
Time Spent: 50m  (was: 40m)

> Filesystems not properly registered using FileIO.write()
> 
>
> Key: BEAM-8303
> URL: https://issues.apache.org/jira/browse/BEAM-8303
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.15.0
>Reporter: Preston Koprivica
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.16.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I’m getting the following error when attempting to use the FileIO APIs 
> (beam-2.15.0) and integrating with AWS S3.  I have set up the PipelineOptions 
> with all the relevant AWS options, so the filesystem registry **should** be 
> properly seeded by the time the graph is compiled and executed:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at 
> org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at 
> org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at 
> org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83)
>     at 
> org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480)
>     at 
> org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93)
>     at 
> org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
>     at 
> org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106)
>     at 
> org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72)
>     at 
> org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47)
>     at 
> org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
>     at 
> org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107)
>     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503)
>     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> For reference, the write code resembles this:
> {code:java}
>  FileIO.Write write = FileIO.write()
>     .via(ParquetIO.sink(schema))
>     .to(options.getOutputDir()) // will be something like: s3:///
>     .withSuffix(".parquet");
> records.apply(String.format("Write(%s)", options.getOutputDir()), write);{code}
> The issue does not appear to be related to ParquetIO.sink().  I am able to 
> reliably reproduce the issue using JSON-formatted records and TextIO.sink(), 
> as well.  Moreover, AvroIO is affected if the withWindowedWrites() option is 
> added.
> Just trying some different knobs, I went ahead and set the following option:
> {code:java}
> write = write.withNoSpilling();{code}
> This actually seemed to fix the issue, only to have it reemerge as I scaled 
> up the data set size.  The stack trace, while very 

[jira] [Work logged] (BEAM-7933) Adding timeout to JobServer grpc calls

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7933?focusedWorklogId=320919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320919
 ]

ASF GitHub Bot logged work on BEAM-7933:


Author: ASF GitHub Bot
Created on: 30/Sep/19 22:36
Start Date: 30/Sep/19 22:36
Worklog Time Spent: 10m 
  Work Description: ecanzonieri commented on pull request #9673: 
[BEAM-7933] Add job server request timeout (default to 60 seconds)
URL: https://github.com/apache/beam/pull/9673#discussion_r329815849
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -251,7 +253,11 @@ def send_options_request(max_retries=5):
   # This reports channel is READY but connections may fail
   # Seems to be only an issue on Mac with port forwardings
   return job_service.DescribePipelineOptions(
-  beam_job_api_pb2.DescribePipelineOptionsRequest())
+  beam_job_api_pb2.DescribePipelineOptionsRequest(),
+  timeout=portable_options.job_server_timeout)
+except grpc.FutureTimeoutError:
+  # no retry for timeout errors
 
 Review comment:
   My guess is that we want to retry on other errors but treat a timeout as 
non-retriable, so that we never wait longer than the specified timeout 
(including retries). We could implement exponential backoff so that the total 
wait still stays within the timeout, but I'm not sure it's worth the additional 
complexity.
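
A minimal sketch of the policy described in this comment, not code from the PR: non-timeout errors are retried with exponential backoff, a timeout is re-raised at once, and every attempt and every sleep is capped by one overall deadline. `CallTimeout`, `call_with_deadline`, and all parameter names are hypothetical stand-ins (the diff above catches `grpc.FutureTimeoutError` instead).

```python
import time


class CallTimeout(Exception):
    """Hypothetical stand-in for a timeout error such as grpc.FutureTimeoutError."""


def call_with_deadline(fn, total_timeout, max_retries=5, base_delay=0.1,
                       clock=time.monotonic, sleep=time.sleep):
    """Call fn(timeout=...) with retries, staying within total_timeout seconds.

    fn is any callable that accepts a `timeout` keyword argument. Timeouts are
    never retried; other errors are retried with exponential backoff.
    """
    if max_retries < 1:
        raise ValueError('max_retries must be at least 1')
    deadline = clock() + total_timeout
    last_error = None
    for attempt in range(max_retries):
        remaining = deadline - clock()
        if remaining <= 0:
            # The budget was used up by earlier attempts and sleeps.
            raise CallTimeout('overall deadline exceeded') from last_error
        try:
            # Each attempt only gets the time that is still left.
            return fn(timeout=remaining)
        except CallTimeout:
            raise  # no retry for timeout errors
        except Exception as error:
            last_error = error
            # Exponential backoff, never sleeping past the deadline.
            delay = min(base_delay * 2 ** attempt, max(0.0, deadline - clock()))
            sleep(delay)
    raise last_error
```

The `clock` and `sleep` parameters exist only so the behaviour can be exercised without real waiting; whether this extra machinery is worth it is exactly the trade-off the comment raises.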
 



Issue Time Tracking
---

Worklog Id: (was: 320919)
Time Spent: 1h 10m  (was: 1h)

> Adding timeout to JobServer grpc calls
> --
>
> Key: BEAM-7933
> URL: https://issues.apache.org/jira/browse/BEAM-7933
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.14.0
>Reporter: Enrico Canzonieri
>Assignee: Enrico Canzonieri
>Priority: Minor
>  Labels: portability
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> gRPC calls to the JobServer from the Python SDK do not have timeouts. That 
> means that the call to pipeline.run() could hang forever if the JobServer is 
> not running (or fails to start).
> E.g. 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L307]
>  the call to Prepare() doesn't provide any timeout value and the same applies 
> to other JobServer requests.
> As part of this ticket we could add a default timeout of 60 seconds, like the 
> default timeout for an HTTP client.
> Additionally, we could consider adding a --job-server-request-timeout to the 
> [PortableOptions|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L805]
>  class to be used in the JobServer interactions inside portable_runner.py.
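
One way the proposed option could look, sketched with plain `argparse` rather than Beam's `PortableOptions` class. The flag spelling `--job_server_timeout` and the helper name are assumptions, chosen to match the `job_server_timeout` attribute read in the PR diff; 60 is the default suggested in the description.

```python
import argparse


def add_job_server_timeout_arg(parser):
    """Register a hypothetical job server timeout flag on an argparse parser."""
    parser.add_argument(
        '--job_server_timeout',
        type=int,
        default=60,
        help='Timeout in seconds for requests to the job server.')


parser = argparse.ArgumentParser()
add_job_server_timeout_arg(parser)

# parse_known_args leaves unrelated pipeline flags alone instead of failing.
options, unknown = parser.parse_known_args(
    ['--job_server_timeout', '30', '--runner', 'PortableRunner'])
```

The parsed value would then be passed as the `timeout` argument of each job server request, as the diff does for `DescribePipelineOptions`.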



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320909
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 22:09
Start Date: 30/Sep/19 22:09
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329808387
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactServiceTest.java
 ##
 @@ -0,0 +1,412 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.google.common.base.Charsets;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.net.URI;
+import java.net.URL;
+import java.net.URLClassLoader;
+import java.nio.charset.Charset;
+import java.nio.file.FileSystem;
+import java.nio.file.FileSystems;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipOutputStream;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi;
+import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ManagedChannel;
+import 
org.apache.beam.vendor.grpc.v1p21p0.io.grpc.inprocess.InProcessChannelBuilder;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link ClassLoaderArtifactRetrievalService} and {@link
+ * JavaFilesystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class ClassLoaderArtifactServiceTest {
+
+  @Rule public TemporaryFolder tempFolder = new TemporaryFolder();
+
+  private static final int ARTIFACT_CHUNK_SIZE = 100;
+
+  private static final Charset BIJECTIVE_CHARSET = Charsets.ISO_8859_1;
+
+  public interface ArtifactServicePair extends AutoCloseable {
+
+public String getStagingToken(String nonce);
+
+public ArtifactStagingServiceGrpc.ArtifactStagingServiceStub 
createStagingStub()
+throws Exception;
+
+public ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub 
createStagingBlockingStub()
+throws Exception;
+
+public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub 
createRetrievalStub()
+throws Exception;
+
+public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub
+createRetrievalBlockingStub() throws Exception;
+  }
+
+  private ArtifactServicePair classLoaderService() throws IOException {
+return new ArtifactServicePair() {
+
+  JavaFilesystemArtifactStagingService stagingService;
+  GrpcFnServer stagingServer;
+  ClassLoaderArtifactRetrievalService retrievalService;
+  GrpcFnServer retrievalServer;
+
+  ArtifactStagingServiceGrpc.ArtifactStagingServiceStub stagingStub;
+  ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub 
stagingBlockingStub;
+  ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub retrievalStub;
+  ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub 
retrievalBlockingStub;
+
+  Path jarPath = 

[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320908
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 22:07
Start Date: 30/Sep/19 22:07
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329807780
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java
 ##
 @@ -53,129 +36,45 @@
  * the artifact layout and retrieval token format produced by {@link
  * BeamFileSystemArtifactStagingService}.
  */
-public class BeamFileSystemArtifactRetrievalService
-extends ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceImplBase
-implements ArtifactRetrievalService {
+public class BeamFileSystemArtifactRetrievalService extends 
AbstractArtifactRetrievalService {
   private static final Logger LOG =
   LoggerFactory.getLogger(BeamFileSystemArtifactRetrievalService.class);
 
-  private static final int ARTIFACT_CHUNK_SIZE_BYTES = 2 << 20; // 2MB
+  private static final Cache MANIFEST_CACHE 
=
+  CacheBuilder.newBuilder()
+  .expireAfterAccess(1, TimeUnit.HOURS /* arbitrary */)
+  .maximumSize(100 /* arbitrary */)
+  .build();
+
+  public BeamFileSystemArtifactRetrievalService() {
+super(MANIFEST_CACHE);
+  }
 
   public static BeamFileSystemArtifactRetrievalService create() {
 return new BeamFileSystemArtifactRetrievalService();
   }
 
   @Override
-  public void getManifest(
-  ArtifactApi.GetManifestRequest request,
-  StreamObserver responseObserver) {
-final String token = request.getRetrievalToken();
-if (Strings.isNullOrEmpty(token)) {
-  throw new StatusRuntimeException(
-  Status.INVALID_ARGUMENT.withDescription("Empty artifact token"));
-}
+  public InputStream openUri(String retrievalToken, String uri) throws 
IOException {
+ResourceId artifactResourceId = FileSystems.matchNewResource(uri, false /* 
is directory */);
 
 Review comment:
   Yeah. This is trying to say "is directory = false." Saying `false /* not a 
directory */` would also be ambiguous, given that we use this convention 
elsewhere. 
 



Issue Time Tracking
---

Worklog Id: (was: 320908)
Time Spent: 40m  (was: 0.5h)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 40m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320907&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320907
 ]

ASF GitHub Bot logged work on BEAM-7223:


Author: ASF GitHub Bot
Created on: 30/Sep/19 22:04
Start Date: 30/Sep/19 22:04
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add 
Python 3.5 Flink ValidatesRunner postcommit suite. 
URL: https://github.com/apache/beam/pull/9691#issuecomment-536771823
 
 
   Run Python 3.5 Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 320907)
Time Spent: 4h  (was: 3h 50m)

> Add ValidatesRunner test suite for Flink on Python 3.
> -
>
> Key: BEAM-7223
> URL: https://issues.apache.org/jira/browse/BEAM-7223
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Ankur Goenka
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Add py3 integration tests for Flink



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320905=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320905
 ]

ASF GitHub Bot logged work on BEAM-8021:


Author: ASF GitHub Bot
Created on: 30/Sep/19 22:00
Start Date: 30/Sep/19 22:00
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9698: [BEAM-8021] Swap 
build-tools to be compile only so it isn't a "required" dependency of Flink.
URL: https://github.com/apache/beam/pull/9698#issuecomment-536764603
 
 
   R: @lgajowy @iemejia @aaltay @mxm 
 



Issue Time Tracking
---

Worklog Id: (was: 320905)
Time Spent: 9.5h  (was: 9h 20m)

> Add Automatic-Module-Name headers for Beam Java modules 
> 
>
> Key: BEAM-8021
> URL: https://issues.apache.org/jira/browse/BEAM-8021
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Lukasz Gajowy
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> For compatibility with the Java Platform Module System (JPMS) in Java 9 and 
> later, every JAR should have a module name, even if the library does not 
> itself use modules. As [suggested in the mailing 
> list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E],
>  this is a simple change that we can do and still be backwards compatible.
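
As an illustration of what the description asks for, here is a minimal Python sketch that writes an `Automatic-Module-Name` header into a jar manifest and reads it back. The helper names and the module name used in the usage note are hypothetical, and the reader ignores the manifest rule that wraps long lines, so treat it as a sketch rather than a full manifest parser.

```python
import io
import zipfile

MANIFEST_PATH = "META-INF/MANIFEST.MF"


def make_jar(module_name):
    """Creates an in-memory jar whose manifest declares a module name."""
    manifest = (
        "Manifest-Version: 1.0\r\n"
        "Automatic-Module-Name: " + module_name + "\r\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr(MANIFEST_PATH, manifest)
    return buffer.getvalue()


def read_automatic_module_name(jar_bytes):
    """Returns the Automatic-Module-Name header, or None if it is absent."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        manifest = jar.read(MANIFEST_PATH).decode("utf-8")
    for line in manifest.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Automatic-Module-Name":
            return value.strip()
    return None
```

For example, `read_automatic_module_name(make_jar("org.example.demo"))` gives back the made-up name `org.example.demo`.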



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7933) Adding timeout to JobServer grpc calls

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7933?focusedWorklogId=320900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320900
 ]

ASF GitHub Bot logged work on BEAM-7933:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:53
Start Date: 30/Sep/19 21:53
Worklog Time Spent: 10m 
  Work Description: ecanzonieri commented on pull request #9673: 
[BEAM-7933] Add job server request timeout (default to 60 seconds)
URL: https://github.com/apache/beam/pull/9673#discussion_r329803384
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -251,7 +253,11 @@ def send_options_request(max_retries=5):
   # This reports channel is READY but connections may fail
   # Seems to be only an issue on Mac with port forwardings
   return job_service.DescribePipelineOptions(
-  beam_job_api_pb2.DescribePipelineOptionsRequest())
+  beam_job_api_pb2.DescribePipelineOptionsRequest(),
+  timeout=portable_options.job_server_timeout)
 
 Review comment:
   My guess is that this is invoking the LocalJobServicer that doesn't have the 
timeout argument. I'll try to repro locally to verify. If that's the case I'd 
like to add the timeout argument to all LocalJobServicer methods.
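
The failure mode guessed at above can be sketched in plain Python. The two servicer classes below are hypothetical stand-ins, not Beam's actual `LocalJobServicer`; they only show why passing `timeout=` by keyword breaks a method that does not declare it, and how adding the argument fixes the call.

```python
class LocalServicerWithoutTimeout:
    """Hypothetical in-process servicer: its method takes only the request."""

    def DescribePipelineOptions(self, request):
        return "options for %s" % request


class LocalServicerWithTimeout:
    """Same servicer, but accepting (and ignoring) a gRPC-style timeout."""

    def DescribePipelineOptions(self, request, timeout=None):
        # An in-process call does not go over the network, so the value is unused.
        return "options for %s" % request


def send_options_request(job_service, timeout_secs=60):
    # Mirrors the call shape in the diff: the timeout is passed by keyword.
    return job_service.DescribePipelineOptions("request", timeout=timeout_secs)
```

Calling `send_options_request` with the first class raises `TypeError` (unexpected keyword argument), while the second class accepts the same call.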
 



Issue Time Tracking
---

Worklog Id: (was: 320900)
Time Spent: 1h  (was: 50m)

> Adding timeout to JobServer grpc calls
> --
>
> Key: BEAM-7933
> URL: https://issues.apache.org/jira/browse/BEAM-7933
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.14.0
>Reporter: Enrico Canzonieri
>Assignee: Enrico Canzonieri
>Priority: Minor
>  Labels: portability
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> gRPC calls to the JobServer from the Python SDK do not have timeouts. That 
> means that the call to pipeline.run() could hang forever if the JobServer is 
> not running (or is failing to start).
> E.g. in 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L307]
>  the call to Prepare() doesn't provide any timeout value, and the same applies 
> to other JobServer requests.
> As part of this ticket we could add a default timeout of 60 seconds, like the 
> default timeout for the http client.
> Additionally, we could consider adding a --job-server-request-timeout to the 
> [PortableOptions|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L805]
>  class to be used in the JobServer interactions inside portable_runner.py.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320899
 ]

ASF GitHub Bot logged work on BEAM-7223:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:51
Start Date: 30/Sep/19 21:51
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add 
Python 3.5 Flink ValidatesRunner postcommit suite. 
URL: https://github.com/apache/beam/pull/9691#issuecomment-536768069
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 320899)
Time Spent: 3h 50m  (was: 3h 40m)

> Add ValidatesRunner test suite for Flink on Python 3.
> -
>
> Key: BEAM-7223
> URL: https://issues.apache.org/jira/browse/BEAM-7223
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Ankur Goenka
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Add py3 integration tests for Flink



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-5559) Beam Dependency Update Request: com.google.guava:guava

2019-09-30 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941360#comment-16941360
 ] 

Luke Cwik commented on BEAM-5559:
-

Updating Guava may be difficult because of the backwards-incompatible changes 
that were made to Guava and the large number of Beam dependencies that rely 
on it.

> Beam Dependency Update Request: com.google.guava:guava
> --
>
> Key: BEAM-5559
> URL: https://issues.apache.org/jira/browse/BEAM-5559
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Fix For: 2.15.0
>
>
>  - 2018-10-01 19:30:53.471497 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 26.0-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2018-10-08 12:18:05.174889 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 26.0-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-04-15 12:32:27.737694 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 27.1-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-04-22 12:10:18.539470 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 27.1-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8330) PubSubIO.readAvros should produce a schema'd PCollection if clazz has a schema

2019-09-30 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-8330:
---

 Summary: PubSubIO.readAvros should produce a schema'd PCollection 
if clazz has a schema
 Key: BEAM-8330
 URL: https://issues.apache.org/jira/browse/BEAM-8330
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Affects Versions: 2.16.0
Reporter: Brian Hulette
Assignee: Brian Hulette
 Fix For: 2.17.0


Currently {{PubsubIO.readAvros(clazz)}} *always* yields a PCollection with an 
AvroCoder. This should only be a fallback in the event that no coder can be 
inferred. That way, if we can infer a schema for {{clazz}}, we will produce a 
PCollection with a schema.
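
The proposed order of preference can be sketched as follows. This is an illustrative Python outline only, not Beam's Java implementation: `SchemaCoder`, `AvroCoderFallback`, `infer_schema`, and the `SCHEMA` attribute are hypothetical stand-ins.

```python
class AvroCoderFallback:
    """Stand-in for the coder used when nothing better can be inferred."""

    def __init__(self, clazz):
        self.clazz = clazz


class SchemaCoder:
    """Stand-in for a coder backed by an inferred schema."""

    def __init__(self, schema):
        self.schema = schema


def infer_schema(clazz):
    """Hypothetical inference: uses a `SCHEMA` attribute if the class has one."""
    return getattr(clazz, "SCHEMA", None)


def choose_coder(clazz):
    schema = infer_schema(clazz)
    if schema is not None:
        return SchemaCoder(schema)
    # Only fall back when no schema could be inferred.
    return AvroCoderFallback(clazz)
```

The point of the ordering is that the Avro-based coder is reached only after schema inference has been tried and has failed.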



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-5559) Beam Dependency Update Request: com.google.guava:guava

2019-09-30 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-5559:

Status: Open  (was: Triage Needed)

> Beam Dependency Update Request: com.google.guava:guava
> --
>
> Key: BEAM-5559
> URL: https://issues.apache.org/jira/browse/BEAM-5559
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Fix For: 2.15.0
>
>
>  - 2018-10-01 19:30:53.471497 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 26.0-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2018-10-08 12:18:05.174889 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 26.0-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-04-15 12:32:27.737694 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 27.1-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-04-22 12:10:18.539470 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 27.1-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-5085) Beam Dependency Update Request: com.google.guava:guava 26.0-jre

2019-09-30 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941356#comment-16941356
 ] 

Luke Cwik commented on BEAM-5085:
-

I closed this issue because the dependency management tool that files these bugs 
now files just one per dependency and keeps it open until the upgrade is done; 
for Guava that is https://issues.apache.org/jira/browse/BEAM-5559 (I accidentally 
closed that one as well when trying to perform some Jira cleanup.)

> Beam Dependency Update Request: com.google.guava:guava 26.0-jre
> ---
>
> Key: BEAM-5085
> URL: https://issues.apache.org/jira/browse/BEAM-5085
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Luke Cwik
>Priority: Major
> Fix For: Not applicable
>
>
> 2018-08-06 12:11:45.524503
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 
> 2018-08-13 12:13:13.070372
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 
> 2018-08-20 12:13:53.326708
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 
> 2018-08-27 12:14:56.293907
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 
> 2018-09-03 12:27:58.181965
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 
> 2018-09-10 12:16:59.770320
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 
> 2018-09-17 12:19:31.661114
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 
> 2018-09-24 12:26:47.172437
> Please review and upgrade the com.google.guava:guava to the latest 
> version 26.0-jre 
>  
> cc: 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (BEAM-5559) Beam Dependency Update Request: com.google.guava:guava

2019-09-30 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reopened BEAM-5559:
-
  Assignee: (was: Luke Cwik)

> Beam Dependency Update Request: com.google.guava:guava
> --
>
> Key: BEAM-5559
> URL: https://issues.apache.org/jira/browse/BEAM-5559
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Fix For: 2.15.0
>
>
>  - 2018-10-01 19:30:53.471497 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 26.0-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2018-10-08 12:18:05.174889 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 26.0-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-04-15 12:32:27.737694 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 27.1-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-04-22 12:10:18.539470 
> -
> Please consider upgrading the dependency com.google.guava:guava. 
> The current version is 20.0. The latest version is 27.1-jre 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8329) AvroCoder ReflectData doesn't use TimestampConversion

2019-09-30 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-8329:
---

 Summary: AvroCoder ReflectData doesn't use TimestampConversion
 Key: BEAM-8329
 URL: https://issues.apache.org/jira/browse/BEAM-8329
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Affects Versions: 2.16.0
Reporter: Brian Hulette
Assignee: Brian Hulette
 Fix For: 2.17.0


The ReflectData created by AvroCoder doesn't have 
{{TimeConversions.TimestampConversion}} registered, which prevents it from 
working with millis-instant Joda DateTime instances. 

AvroUtils does this statically 
[here|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/utils/AvroUtils.java#L78].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules

2019-09-30 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941351#comment-16941351
 ] 

Luke Cwik commented on BEAM-8021:
-

I closed Lukasz Gajowy's PR and changed the Flink runner so that it no longer 
exports the build-tools artifact as a required dependency, in 
[https://github.com/apache/beam/pull/9698]

> Add Automatic-Module-Name headers for Beam Java modules 
> 
>
> Key: BEAM-8021
> URL: https://issues.apache.org/jira/browse/BEAM-8021
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Lukasz Gajowy
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> For compatibility with the Java Platform Module System (JPMS) in Java 9 and 
> later, every JAR should have a module name, even if the library does not 
> itself use modules. As [suggested in the mailing 
> list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E],
>  this is a simple change that we can do and still be backwards compatible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-6896) Beam Dependency Update Request: PyYAML

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6896?focusedWorklogId=320893=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320893
 ]

ASF GitHub Bot logged work on BEAM-6896:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:40
Start Date: 30/Sep/19 21:40
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #9699: [BEAM-6896] 
Loosen PyYAML dependency.
URL: https://github.com/apache/beam/pull/9699
 
 
   Also update this to be a test-only dependency, as it's only used for tests.
   
   There were no relevant API changes with the last major release.
   
   
   

[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320892=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320892
 ]

ASF GitHub Bot logged work on BEAM-8021:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:40
Start Date: 30/Sep/19 21:40
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9698: [BEAM-8021] Swap 
build-tools to be compile only so it isn't a "required" dependency of Flink.
URL: https://github.com/apache/beam/pull/9698#issuecomment-536764603
 
 
   @lgajowy @iemejia @aaltay 
 



Issue Time Tracking
---

Worklog Id: (was: 320892)
Time Spent: 9h 10m  (was: 9h)

> Add Automatic-Module-Name headers for Beam Java modules 
> 
>
> Key: BEAM-8021
> URL: https://issues.apache.org/jira/browse/BEAM-8021
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Lukasz Gajowy
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> For compatibility with the Java Platform Module System (JPMS) in Java 9 and 
> later, every JAR should have a module name, even if the library does not 
> itself use modules. As [suggested in the mailing 
> list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E],
>  this is a simple change that we can do and still be backwards compatible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-6896) Beam Dependency Update Request: PyYAML

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6896?focusedWorklogId=320895=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320895
 ]

ASF GitHub Bot logged work on BEAM-6896:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:40
Start Date: 30/Sep/19 21:40
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #9699: [BEAM-6896] Loosen 
PyYAML dependency.
URL: https://github.com/apache/beam/pull/9699#issuecomment-536764669
 
 
   R: @yifanzou
 



Issue Time Tracking
---

Worklog Id: (was: 320895)
Time Spent: 20m  (was: 10m)

> Beam Dependency Update Request: PyYAML
> --
>
> Key: BEAM-6896
> URL: https://issues.apache.org/jira/browse/BEAM-6896
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-03-25 04:17:47.501359 
> -
> Please consider upgrading the dependency PyYAML. 
> The current version is 3.13. The latest version is 5.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320894=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320894
 ]

ASF GitHub Bot logged work on BEAM-8021:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:40
Start Date: 30/Sep/19 21:40
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9698: [BEAM-8021] Swap 
build-tools to be compile only so it isn't a "required" dependency of Flink.
URL: https://github.com/apache/beam/pull/9698#issuecomment-536764603
 
 
   R: @lgajowy @iemejia @aaltay 
 



Issue Time Tracking
---

Worklog Id: (was: 320894)
Time Spent: 9h 20m  (was: 9h 10m)

> Add Automatic-Module-Name headers for Beam Java modules 
> 
>
> Key: BEAM-8021
> URL: https://issues.apache.org/jira/browse/BEAM-8021
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Lukasz Gajowy
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> For compatibility with the Java Platform Module System (JPMS) in Java 9 and 
> later, every JAR should have a module name, even if the library does not 
> itself use modules. As [suggested in the mailing 
> list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E],
>  this is a simple change that we can do and still be backwards compatible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320889=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320889
 ]

ASF GitHub Bot logged work on BEAM-8021:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:38
Start Date: 30/Sep/19 21:38
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9690: [BEAM-8021] Publish 
build-tools jar again due to flink runner usage issues
URL: https://github.com/apache/beam/pull/9690#issuecomment-536763886
 
 
   Closing in favor of #9698 
 



Issue Time Tracking
---

Worklog Id: (was: 320889)
Time Spent: 8h 50m  (was: 8h 40m)

> Add Automatic-Module-Name headers for Beam Java modules 
> 
>
> Key: BEAM-8021
> URL: https://issues.apache.org/jira/browse/BEAM-8021
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Lukasz Gajowy
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> For compatibility with the Java Platform Module System (JPMS) in Java 9 and 
> later, every JAR should have a module name, even if the library does not 
> itself use modules. As [suggested in the mailing 
> list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E],
>  this is a simple change that we can do and still be backwards compatible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=320888=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320888
 ]

ASF GitHub Bot logged work on BEAM-8021:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:37
Start Date: 30/Sep/19 21:37
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9698: [BEAM-8021] 
Swap build-tools to be compile only so it isn't a "required" dependency of 
Flink.
URL: https://github.com/apache/beam/pull/9698
 
 
   
   

[jira] [Commented] (BEAM-6896) Beam Dependency Update Request: PyYAML

2019-09-30 Thread Robert Bradshaw (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941345#comment-16941345
 ] 

Robert Bradshaw commented on BEAM-6896:
---

This should be an easy fix. 

> Beam Dependency Update Request: PyYAML
> --
>
> Key: BEAM-6896
> URL: https://issues.apache.org/jira/browse/BEAM-6896
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.17.0
>
>
>  - 2019-03-25 04:17:47.501359 
> -
> Please consider upgrading the dependency PyYAML. 
> The current version is 3.13. The latest version is 5.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Updated] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-30 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-5878:
--
Fix Version/s: (was: 2.16.0)
   2.17.0

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: yoshiki obata
>Priority: Minor
> Fix For: 2.17.0
>
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> Python 3.0 [adds the ability|https://www.python.org/dev/peps/pep-3102/] to 
> define functions with keyword-only arguments. 
> Currently Beam does not handle them correctly. [~ruoyu] pointed out [one 
> place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118]
>  in our codebase that we should fix: in Python 3.0, inspect.getargspec() 
> fails on functions with keyword-only arguments, but a new function, 
> [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec],
>  supports them.
> There may be implications for our (best-effort) type-hints machinery.
> We should also add Py3-only unit tests that cover DoFns with keyword-only 
> arguments once the Beam Python 3 tests are in good shape.
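
A minimal sketch of the behaviour the description refers to. The function name `process` and its parameters are hypothetical and are not taken from the Beam codebase; only `inspect.getfullargspec()` is a real standard-library call. It reports keyword-only arguments in dedicated fields:

```python
import inspect


def process(element, *, side=None, timestamp=None):
    """Hypothetical function with keyword-only arguments (PEP 3102)."""
    return element, side, timestamp


# getfullargspec() separates positional parameters from keyword-only ones.
spec = inspect.getfullargspec(process)

positional_args = spec.args            # names before the bare '*'
kwonly_args = spec.kwonlyargs          # names after the bare '*'
kwonly_defaults = spec.kwonlydefaults  # defaults of the keyword-only names
```

Code that only inspects `spec.args` would miss `side` and `timestamp`, which is presumably why the type-hints machinery may need attention as well.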





[jira] [Work logged] (BEAM-7933) Adding timeout to JobServer grpc calls

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7933?focusedWorklogId=320886=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320886
 ]

ASF GitHub Bot logged work on BEAM-7933:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:29
Start Date: 30/Sep/19 21:29
Worklog Time Spent: 10m 
  Work Description: ecanzonieri commented on pull request #9673: 
[BEAM-7933] Add job server request timeout (default to 60 seconds)
URL: https://github.com/apache/beam/pull/9673#discussion_r329794942
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -141,23 +141,25 @@ def _create_environment(options):
   payload=(portable_options.environment_config.encode('ascii')
if portable_options.environment_config else None))
 
-  def default_job_server(self, options):
+  def default_job_server(self, portable_options):
     # TODO Provide a way to specify a container Docker URL
     # https://issues.apache.org/jira/browse/BEAM-6328
     if not self._dockerized_job_server:
       self._dockerized_job_server = job_server.StopOnExitJobServer(
           job_server.DockerizedJobServer())
     return self._dockerized_job_server
 
-  def create_job_service(self, options):
-    job_endpoint = options.view_as(PortableOptions).job_endpoint
+  def create_job_service(self, portable_options):
 
 Review comment:
   I changed it because it is immediately converted back to 
portable_options. The caller takes options and converts it to 
portable_options, then passes options to the create_job_service method, where 
it is converted to portable_options again. It seems to me that we could pass 
portable_options in the first place and simplify the code a bit. However, I 
don't have a strong opinion, so I'll revert to options.
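
To illustrate the point of the review comment, here is a minimal sketch of converting once and passing the converted view along. The two classes below are simplified, hypothetical stand-ins written for this example (they are not Beam's real `PipelineOptions` or `PortableOptions`), and the endpoint value is made up:

```python
class PortableOptions:
    """Hypothetical stand-in holding only the job endpoint."""

    def __init__(self, job_endpoint):
        self.job_endpoint = job_endpoint


class PipelineOptions:
    """Hypothetical stand-in that counts how often view_as() is called."""

    def __init__(self, job_endpoint):
        self._job_endpoint = job_endpoint
        self.view_calls = 0

    def view_as(self, cls):
        self.view_calls += 1
        return cls(self._job_endpoint)


def create_job_service(portable_options):
    # Receives the already-converted view, so no second conversion is needed.
    return 'job service at %s' % portable_options.job_endpoint


def run(options):
    # Convert once in the caller and hand the result down.
    portable_options = options.view_as(PortableOptions)
    return create_job_service(portable_options)


opts = PipelineOptions('localhost:8099')
result = run(opts)
```

With this shape the conversion happens a single time per run. Keeping the original `options` parameter, as the comment concludes, is equally valid and only costs one extra conversion.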
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320886)
Time Spent: 50m  (was: 40m)

> Adding timeout to JobServer grpc calls
> --
>
> Key: BEAM-7933
> URL: https://issues.apache.org/jira/browse/BEAM-7933
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.14.0
>Reporter: Enrico Canzonieri
>Assignee: Enrico Canzonieri
>Priority: Minor
>  Labels: portability
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> gRPC calls to the JobServer from the Python SDK do not have timeouts. That 
> means that the call to pipeline.run() could hang forever if the JobServer is 
> not running (or fails to start).
> E.g. 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/portable_runner.py#L307]
>  the call to Prepare() doesn't provide any timeout value and the same applies 
> to other JobServer requests.
> As part of this ticket we could set 60 seconds as the default timeout for 
> the HTTP client.
> Additionally, we could consider adding a --job-server-request-timeout option to the 
> [PortableOptions|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/options/pipeline_options.py#L805]
>  class to be used in the JobServer interactions inside portable_runner.py.
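
A minimal sketch of the proposal, under stated assumptions: the flag name and the 60-second default come from the ticket text, while `FakeJobServiceStub`, `parse_timeout` and `prepare_job` are hypothetical names invented for this example. Real gRPC Python stubs accept a per-call `timeout=` keyword; the fake stub below only mimics that so the sketch runs without a JobServer:

```python
import argparse

DEFAULT_TIMEOUT_SECS = 60  # default suggested in the ticket


def parse_timeout(argv):
    """Reads the proposed flag, falling back to the default."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--job-server-request-timeout',
                        type=int, default=DEFAULT_TIMEOUT_SECS)
    known, _ = parser.parse_known_args(argv)
    return known.job_server_request_timeout


class FakeJobServiceStub:
    """Hypothetical stand-in for a gRPC stub; it just echoes its arguments."""

    def Prepare(self, request, timeout=None):
        return {'request': request, 'timeout': timeout}


def prepare_job(stub, request, timeout_secs):
    # Pass the timeout on every call so a missing JobServer cannot hang forever.
    return stub.Prepare(request, timeout=timeout_secs)


default_timeout = parse_timeout([])
custom_timeout = parse_timeout(['--job-server-request-timeout', '5'])
response = prepare_job(FakeJobServiceStub(), 'prepare-request', custom_timeout)
```

The same `timeout_secs` value would be threaded through the other JobServer requests the ticket mentions.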





[jira] [Resolved] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev resolved BEAM-8324.
---
Resolution: Fixed

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 were given
> {noformat}





[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320882
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:27
Start Date: 30/Sep/19 21:27
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #9696: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9696#issuecomment-536760418
 
 
   This is a cherry-pick of https://github.com/apache/beam/pull/9695
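
A minimal sketch of the kind of upper-bound check the pull request title describes. The exact range used here (at least 0.3.0, below 0.3.1) is an assumption made for illustration and is not quoted from the PR; `parse_version` and `dill_version_allowed` are hypothetical helper names:

```python
def parse_version(text):
    """Turns a dotted version such as '0.3.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split('.'))


LOWER_BOUND = parse_version('0.3.0')  # inclusive (assumed)
UPPER_BOUND = parse_version('0.3.1')  # exclusive (assumed)


def dill_version_allowed(version_text):
    # Tuple comparison: (0, 3, 1, 1) sorts after (0, 3, 1), so it is rejected.
    return LOWER_BOUND <= parse_version(version_text) < UPPER_BOUND


old_ok = dill_version_allowed('0.3.0')
new_ok = dill_version_allowed('0.3.1.1')
```

Pinning below the incompatible release keeps the session pickled on the client and the dill version on the worker in agreement.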
 



Issue Time Tracking
---

Worklog Id: (was: 320882)
Time Spent: 1h 50m  (was: 1h 40m)

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 were given
> {noformat}





[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320883
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:27
Start Date: 30/Sep/19 21:27
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9696: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9696
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 320883)
Time Spent: 2h  (was: 1h 50m)

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 were given
> {noformat}





[jira] [Commented] (BEAM-8183) Optionally bundle multiple pipelines into a single Flink jar

2019-09-30 Thread Ankur Goenka (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941330#comment-16941330
 ] 

Ankur Goenka commented on BEAM-8183:


Flink has a neat feature for picking the pipeline on the fly. 

I don't think this is a very common use case with Beam, though.

Given that the Beam Job Submission API works on a single pipeline at a time, 
introducing multiple pipelines in a single jar would make for a very convoluted 
workflow.

 

Would it be easier to just store the pipelines as separate jars in global 
storage (HDFS etc.) and pass the right jar at pipeline submission time?

If the submission happens through a service, would it be easier and less 
error-prone to keep these jars separately on the service and submit the right 
jar to Flink based on a parameter?

> Optionally bundle multiple pipelines into a single Flink jar
> 
>
> Key: BEAM-8183
> URL: https://issues.apache.org/jira/browse/BEAM-8183
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>
> [https://github.com/apache/beam/pull/9331#issuecomment-526734851]
> "With Flink you can bundle multiple entry points into the same jar file and 
> specify which one to use with optional flags. It may be desirable to allow 
> inclusion of multiple pipelines for this tool also, although that would 
> require a different workflow. Absent this option, it becomes quite convoluted 
> for users that need the flexibility to choose which pipeline to launch at 
> submission time."
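
The idea in the quoted description, several entry points bundled together with a flag selecting one at submission time, can be sketched generically. This is a concept illustration only and does not use Flink's or Beam's APIs; the `--pipeline` flag, the registry and the builder functions are all hypothetical:

```python
import argparse


def build_wordcount():
    """Hypothetical entry point; a real one would construct a pipeline."""
    return 'wordcount pipeline'


def build_sessionize():
    """Second hypothetical entry point bundled in the same artifact."""
    return 'sessionize pipeline'


# Registry of every entry point shipped in the bundle.
PIPELINES = {
    'wordcount': build_wordcount,
    'sessionize': build_sessionize,
}


def select_pipeline(argv):
    """Picks one bundled entry point based on a command-line flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--pipeline', choices=sorted(PIPELINES), required=True)
    args, _ = parser.parse_known_args(argv)
    return PIPELINES[args.pipeline]


chosen = select_pipeline(['--pipeline', 'sessionize'])
built = chosen()
```

The alternative raised in the comment, one jar per pipeline selected by the submitting service, moves the same lookup out of the artifact and into the service.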





[jira] [Commented] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941329#comment-16941329
 ] 

Valentyn Tymofieiev commented on BEAM-5878:
---

We have restricted the dill version on master due to a stack overflow error 
(https://issues.apache.org/jira/browse/BEAM-8324?focusedCommentId=16941261=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16941261),
 thanks for reporting it.

Also, https://github.com/apache/beam/pull/9696 will restrict the dill version on 
2.16.0, so keyword-only arguments won't be available in 2.16.0 yet. We 
can continue to investigate the problem related to the dill upgrade in this bug.

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: yoshiki obata
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> Python 3.0 [adds the ability|https://www.python.org/dev/peps/pep-3102/] to 
> define functions with keyword-only arguments. 
> Currently Beam does not handle them correctly. [~ruoyu] pointed out [one 
> place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118]
>  in our codebase that we should fix: in Python 3.0, inspect.getargspec() 
> fails on functions with keyword-only arguments, but a new function, 
> [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec],
>  supports them.
> There may be implications for our (best-effort) type-hints machinery.
> We should also add Py3-only unit tests that cover DoFns with keyword-only 
> arguments once the Beam Python 3 tests are in good shape.





[jira] [Comment Edited] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941329#comment-16941329
 ] 

Valentyn Tymofieiev edited comment on BEAM-5878 at 9/30/19 9:07 PM:


We have restricted the dill version on master due to a stack overflow error 
(https://issues.apache.org/jira/browse/BEAM-8324?focusedCommentId=16941261=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16941261),
 thanks for reporting it.

Also, https://github.com/apache/beam/pull/9696 will restrict the dill version on 
2.16.0, so keyword-only arguments won't be available in 2.16.0 yet. We can 
continue to investigate the problem related to the dill upgrade in this bug.


was (Author: tvalentyn):
We have restricted dill version on master due to stackfoverflow error 
(https://issues.apache.org/jira/browse/BEAM-8324?focusedCommentId=16941261=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16941261),
 thanks for reporting it.

Also, https://github.com/apache/beam/pull/9696 will restrict dill version on 
2.16.0, keyword-only arguments won't yet be available in 2.16.0 yet, but. We 
can continue to investigate the problem related to dill upgrade in this bug.

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: yoshiki obata
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> Python 3.0 [adds the ability|https://www.python.org/dev/peps/pep-3102/] to 
> define functions with keyword-only arguments. 
> Currently Beam does not handle them correctly. [~ruoyu] pointed out [one 
> place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118]
>  in our codebase that we should fix: in Python 3.0, inspect.getargspec() 
> fails on functions with keyword-only arguments, but a new function, 
> [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec],
>  supports them.
> There may be implications for our (best-effort) type-hints machinery.
> We should also add Py3-only unit tests that cover DoFns with keyword-only 
> arguments once the Beam Python 3 tests are in good shape.





[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320874=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320874
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 21:00
Start Date: 30/Sep/19 21:00
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9696: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9696#issuecomment-536750631
 
 
   R: @markflyhigh 
 



Issue Time Tracking
---

Worklog Id: (was: 320874)
Time Spent: 1h 40m  (was: 1.5h)

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 were given
> {noformat}





[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320859=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320859
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:59
Start Date: 30/Sep/19 20:59
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9695: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9695
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 320859)
Time Spent: 1.5h  (was: 1h 20m)

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 were given
> {noformat}





[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320858
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:57
Start Date: 30/Sep/19 20:57
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9695: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9695#issuecomment-536749358
 
 
   R: @aaltay or @markflyhigh 
 



Issue Time Tracking
---

Worklog Id: (was: 320858)
Time Spent: 1h 20m  (was: 1h 10m)

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 were given
> {noformat}





[jira] [Assigned] (BEAM-5840) Add unit tests for import scripts

2019-09-30 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin reassigned BEAM-5840:
---

Assignee: (was: Mikhail Gryzykhin)

> Add unit tests for import scripts
> -
>
> Key: BEAM-5840
> URL: https://issues.apache.org/jira/browse/BEAM-5840
> Project: Beam
>  Issue Type: Sub-task
>  Components: project-management
>Reporter: Scott Wegner
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> For custom import scripts, we should refactor the logic to be testable and 
> add some basic unit tests.
> At the very least, there should be some "smoke test" that the local 
> deployment works.





[jira] [Assigned] (BEAM-5923) Utilize common .pylintrc with python sdk for .test-infra/metrics

2019-09-30 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin reassigned BEAM-5923:
---

Assignee: (was: Mikhail Gryzykhin)

> Utilize common .pylintrc with python sdk for .test-infra/metrics
> 
>
> Key: BEAM-5923
> URL: https://issues.apache.org/jira/browse/BEAM-5923
> Project: Beam
>  Issue Type: Sub-task
>  Components: project-management
>Reporter: Mikhail Gryzykhin
>Priority: Major
>
> Add common linter and formatter to metrics code.





[jira] [Work logged] (BEAM-8303) Filesystems not properly registered using FileIO.write()

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8303?focusedWorklogId=320852=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320852
 ]

ASF GitHub Bot logged work on BEAM-8303:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:46
Start Date: 30/Sep/19 20:46
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #9688: [BEAM-8303] 
Ensure FileSystems registration code runs in non UDF Flink operators
URL: https://github.com/apache/beam/pull/9688#issuecomment-536744819
 
 
   The Portable_Python failure is known and is tracked in 
https://issues.apache.org/jira/browse/BEAM-8324.
 



Issue Time Tracking
---

Worklog Id: (was: 320852)
Time Spent: 40m  (was: 0.5h)

> Filesystems not properly registered using FileIO.write()
> 
>
> Key: BEAM-8303
> URL: https://issues.apache.org/jira/browse/BEAM-8303
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.15.0
>Reporter: Preston Koprivica
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.16.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I’m getting the following error when attempting to use the FileIO APIs 
> (beam-2.15.0) and integrating with AWS S3.  I have set up the PipelineOptions 
> with all the relevant AWS options, so the filesystem registry **should** be 
> properly seeded by the time the graph is compiled and executed:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at 
> org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at 
> org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at 
> org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83)
>     at 
> org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480)
>     at 
> org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93)
>     at 
> org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
>     at 
> org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106)
>     at 
> org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72)
>     at 
> org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47)
>     at 
> org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
>     at 
> org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107)
>     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503)
>     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> For reference, the write code resembles this:
> {code:java}
>  FileIO.Write write = FileIO.write()
>     .via(ParquetIO.sink(schema))
>     .to(options.getOutputDir()) // will be something like: s3:///
>     .withSuffix(".parquet");
> records.apply(String.format("Write(%s)", options.getOutputDir()), write);{code}
> The issue does not appear to be related to ParquetIO.sink().  I am able to 
> reliably reproduce the issue using JSON formatted records and TextIO.sink(), 
> as well.  Moreover, AvroIO is affected if withWindowedWrites() option is 
> added.
> Just trying some different knobs, I went ahead and set the following option:
> {code:java}
> write = write.withNoSpilling();{code}
> This actually seemed to fix the issue, only to have it reemerge as I scaled 
> up the data set size.  The stack trace, 
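
The error above is consistent with a scheme-keyed registry that has not been populated in the JVM where the coder's decode() runs. The following is only a minimal sketch of that failure mode using the standard library; `SchemeRegistrySketch`, `register`, and `resolve` are hypothetical names for illustration and are not Beam's `FileSystems` API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a per-JVM, scheme-keyed filesystem registry. */
public class SchemeRegistrySketch {
  // Static state exists once per JVM, so each worker process has to populate it itself.
  private static final Map<String, String> REGISTRY = new ConcurrentHashMap<>();

  /** Registers a handler name for a URI scheme (illustrative only). */
  public static void register(String scheme, String handler) {
    REGISTRY.put(scheme, handler);
  }

  /** Resolves the handler for a path such as "s3://bucket/key". */
  public static String resolve(String path) {
    int idx = path.indexOf("://");
    String scheme = idx < 0 ? "file" : path.substring(0, idx);
    String handler = REGISTRY.get(scheme);
    if (handler == null) {
      throw new IllegalArgumentException("No filesystem found for scheme " + scheme);
    }
    return handler;
  }

  public static void main(String[] args) {
    try {
      // Lookup before registration: mirrors decoding a record on an unseeded worker.
      resolve("s3://bucket/out.parquet");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
    // Once registration has run in this JVM, the same lookup succeeds.
    register("s3", "S3FileSystem");
    System.out.println(resolve("s3://bucket/out.parquet"));
  }
}
```

The point of the sketch is only that registration has to happen in every process that may resolve a path, including operators that never run user code.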

[jira] [Commented] (BEAM-6816) More PR information in the code velocity dashboard?

2019-09-30 Thread Mikhail Gryzykhin (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941314#comment-16941314
 ] 

Mikhail Gryzykhin commented on BEAM-6816:
-

Unassigning since I'm not actively working on this task.

> More PR information in the code velocity dashboard?
> ---
>
> Key: BEAM-6816
> URL: https://issues.apache.org/jira/browse/BEAM-6816
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Pablo Estrada
>Priority: Major
>
> I've been using this dashboard: 
> http://104.154.241.245/d/code_velocity/code-velocity?orgId=1
> To triage and get PRs reviewed. Some of them have DO NOT REVIEW / or special 
> instructions in the title.
> Perhaps it would be nice to have more information displayed on the dashboard, 
> so we can skip PRs that don't need to be reviewed.
> As you rightly mentioned, another idea is to exclude PRs that have a specific 
> tag - and I think that's a good idea in fact.





[jira] [Work logged] (BEAM-8321) beam_PostCommit_PortableJar_Flink postcommit failing

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8321?focusedWorklogId=320847=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320847
 ]

ASF GitHub Bot logged work on BEAM-8321:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:41
Start Date: 30/Sep/19 20:41
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9680: [BEAM-8321] fix 
Flink portable jar test
URL: https://github.com/apache/beam/pull/9680
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 320847)
Time Spent: 1h 20m  (was: 1h 10m)

> beam_PostCommit_PortableJar_Flink postcommit failing
> 
>
> Key: BEAM-8321
> URL: https://issues.apache.org/jira/browse/BEAM-8321
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Python docker container image names/tags need to be updated.





[jira] [Work logged] (BEAM-8321) beam_PostCommit_PortableJar_Flink postcommit failing

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8321?focusedWorklogId=320846=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320846
 ]

ASF GitHub Bot logged work on BEAM-8321:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:40
Start Date: 30/Sep/19 20:40
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9680: [BEAM-8321] fix Flink 
portable jar test
URL: https://github.com/apache/beam/pull/9680#issuecomment-536742437
 
 
   Failure 
(`org.apache.beam.runners.dataflow.worker.fn.control.TimerReceiverTest.testSingleTimerScheduling`)
 looks unrelated
 



Issue Time Tracking
---

Worklog Id: (was: 320846)
Time Spent: 1h 10m  (was: 1h)

> beam_PostCommit_PortableJar_Flink postcommit failing
> 
>
> Key: BEAM-8321
> URL: https://issues.apache.org/jira/browse/BEAM-8321
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Python docker container image names/tags need to be updated.





[jira] [Assigned] (BEAM-6816) More PR information in the code velocity dashboard?

2019-09-30 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin reassigned BEAM-6816:
---

Assignee: (was: Mikhail Gryzykhin)

> More PR information in the code velocity dashboard?
> ---
>
> Key: BEAM-6816
> URL: https://issues.apache.org/jira/browse/BEAM-6816
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Pablo Estrada
>Priority: Major
>
> I've been using this dashboard: 
> http://104.154.241.245/d/code_velocity/code-velocity?orgId=1
> To triage and get PRs reviewed. Some of them have DO NOT REVIEW / or special 
> instructions in the title.
> Perhaps it would be nice to have more information displayed on the dashboard, 
> so we can skip PRs that don't need to be reviewed.
> As you rightly mentioned, another idea is to exclude PRs that have a specific 
> tag - and I think that's a good idea in fact.





[jira] [Resolved] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-30 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin resolved BEAM-7969.
-
Fix Version/s: 2.17.0
   Resolution: Fixed

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> EOM





[jira] [Comment Edited] (BEAM-7902) :beam-test-infra-metrics:test is consistently failing

2019-09-30 Thread Mikhail Gryzykhin (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941312#comment-16941312
 ] 

Mikhail Gryzykhin edited comment on BEAM-7902 at 9/30/19 8:34 PM:
--

Closing in favor of

https://issues.apache.org/jira/browse/BEAM-8327

and

https://issues.apache.org/jira/browse/BEAM-8328


was (Author: ardagan):
Closing in favor of https://issues.apache.org/jira/browse/BEAM-8327

> :beam-test-infra-metrics:test is consistently failing
> -
>
> Key: BEAM-7902
> URL: https://issues.apache.org/jira/browse/BEAM-7902
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mark Liu
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A failed Gradle run from my local machine:  
> https://gradle.com/s/sqcwi7bo3a2nw. Stacktrace:
> {code}
> org.apache.beam.testinfra.metrics.ProberTests > CheckGrafanaStalenessAlerts 
> FAILED
> java.lang.AssertionError: Input data is stale! [id:1, dashboardId:5, 
> dashboardUid:data-freshness, dashboardSlug:source-data-freshness, panelId:2, 
> name:Source Data Freshness alert, state:alerting, 
> newStateDate:2019-07-30T02:55:02Z, evalDate:0001-01-01T00:00:00Z, 
> evalData:[evalMatches:[[metric:GitHub, tags:null, 
> value:482.1661704504]]], executionError:, 
> url:/d/data-freshness/source-data-freshness]
>See: http://104.154.241.245/d/data-freshness. Expression: (alert.state 
> == ok)
> at 
> org.codehaus.groovy.runtime.InvokerHelper.assertFailed(InvokerHelper.java:406)
> at 
> org.codehaus.groovy.runtime.ScriptBytecodeAdapter.assertFailed(ScriptBytecodeAdapter.java:650)
> ...
> {code}
> Following the suggestion in the stacktrace, I went to 
> http://104.154.241.245/d/data-freshness and saw that the GitHub source is older than a 
> week, which I guess is the likely reason for the failure. However, there is no further 
> instruction on what to do next. 
> After some investigation, I found that this test `:beam-test-infra-metrics:test` 
> was added to 
> [beam_Prober_CommunityMetrics|https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Prober_CommunityMetrics/]
>  but has never passed (it either failed or was skipped due to cache). 
> Is there anything wrong with the GitHub data refresh? Should we reevaluate the 
> alert on the dashboard or modify the test?





[jira] [Resolved] (BEAM-7902) :beam-test-infra-metrics:test is consistently failing

2019-09-30 Thread Mikhail Gryzykhin (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin resolved BEAM-7902.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> :beam-test-infra-metrics:test is consistently failing
> -
>
> Key: BEAM-7902
> URL: https://issues.apache.org/jira/browse/BEAM-7902
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mark Liu
>Assignee: Mikhail Gryzykhin
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A failed Gradle run from my local machine:  
> https://gradle.com/s/sqcwi7bo3a2nw. Stacktrace:
> {code}
> org.apache.beam.testinfra.metrics.ProberTests > CheckGrafanaStalenessAlerts 
> FAILED
> java.lang.AssertionError: Input data is stale! [id:1, dashboardId:5, 
> dashboardUid:data-freshness, dashboardSlug:source-data-freshness, panelId:2, 
> name:Source Data Freshness alert, state:alerting, 
> newStateDate:2019-07-30T02:55:02Z, evalDate:0001-01-01T00:00:00Z, 
> evalData:[evalMatches:[[metric:GitHub, tags:null, 
> value:482.1661704504]]], executionError:, 
> url:/d/data-freshness/source-data-freshness]
>See: http://104.154.241.245/d/data-freshness. Expression: (alert.state 
> == ok)
> at 
> org.codehaus.groovy.runtime.InvokerHelper.assertFailed(InvokerHelper.java:406)
> at 
> org.codehaus.groovy.runtime.ScriptBytecodeAdapter.assertFailed(ScriptBytecodeAdapter.java:650)
> ...
> {code}
> Following the suggestion in the stacktrace, I went to 
> http://104.154.241.245/d/data-freshness and saw that the GitHub source is older than a 
> week, which I guess is the likely reason for the failure. However, there is no further 
> instruction on what to do next. 
> After some investigation, I found that this test `:beam-test-infra-metrics:test` 
> was added to 
> [beam_Prober_CommunityMetrics|https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Prober_CommunityMetrics/]
>  but has never passed (it either failed or was skipped due to cache). 
> Is there anything wrong with the GitHub data refresh? Should we reevaluate the 
> alert on the dashboard or modify the test?





[jira] [Commented] (BEAM-7902) :beam-test-infra-metrics:test is consistently failing

2019-09-30 Thread Mikhail Gryzykhin (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941312#comment-16941312
 ] 

Mikhail Gryzykhin commented on BEAM-7902:
-

Closing in favor of https://issues.apache.org/jira/browse/BEAM-8327

> :beam-test-infra-metrics:test is consistently failing
> -
>
> Key: BEAM-7902
> URL: https://issues.apache.org/jira/browse/BEAM-7902
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mark Liu
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A failed Gradle run from my local machine:  
> https://gradle.com/s/sqcwi7bo3a2nw. Stacktrace:
> {code}
> org.apache.beam.testinfra.metrics.ProberTests > CheckGrafanaStalenessAlerts 
> FAILED
> java.lang.AssertionError: Input data is stale! [id:1, dashboardId:5, 
> dashboardUid:data-freshness, dashboardSlug:source-data-freshness, panelId:2, 
> name:Source Data Freshness alert, state:alerting, 
> newStateDate:2019-07-30T02:55:02Z, evalDate:0001-01-01T00:00:00Z, 
> evalData:[evalMatches:[[metric:GitHub, tags:null, 
> value:482.1661704504]]], executionError:, 
> url:/d/data-freshness/source-data-freshness]
>See: http://104.154.241.245/d/data-freshness. Expression: (alert.state 
> == ok)
> at 
> org.codehaus.groovy.runtime.InvokerHelper.assertFailed(InvokerHelper.java:406)
> at 
> org.codehaus.groovy.runtime.ScriptBytecodeAdapter.assertFailed(ScriptBytecodeAdapter.java:650)
> ...
> {code}
> Following the suggestion in the stacktrace, I went to 
> http://104.154.241.245/d/data-freshness and saw that the GitHub source is older than a 
> week, which I guess is the likely reason for the failure. However, there is no further 
> instruction on what to do next. 
> After some investigation, I found that this test `:beam-test-infra-metrics:test` 
> was added to 
> [beam_Prober_CommunityMetrics|https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Prober_CommunityMetrics/]
>  but has never passed (it either failed or was skipped due to cache). 
> Is there anything wrong with the GitHub data refresh? Should we reevaluate the 
> alert on the dashboard or modify the test?





[jira] [Created] (BEAM-8328) remove :beam-test-infra-metrics:test from build target.

2019-09-30 Thread Mikhail Gryzykhin (Jira)
Mikhail Gryzykhin created BEAM-8328:
---

 Summary: remove :beam-test-infra-metrics:test from build target.
 Key: BEAM-8328
 URL: https://issues.apache.org/jira/browse/BEAM-8328
 Project: Beam
  Issue Type: Bug
  Components: project-management
Reporter: Mikhail Gryzykhin








[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320837=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320837
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:32
Start Date: 30/Sep/19 20:32
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329772101
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/JavaFilesystemArtifactStagingService.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.channels.Channels;
+import java.nio.channels.WritableByteChannel;
+import java.nio.file.FileSystem;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.Comparator;
+import java.util.stream.Stream;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+
+/**
+ * An {@link ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase} that 
loads artifacts into a
+ * Java FileSystem.
+ */
+public class JavaFilesystemArtifactStagingService extends 
AbstractArtifactStagingService {
 
 Review comment:
   I'm not sure why this staging service is necessary. 
`PortablePipelineJarCreator` already writes artifacts to a jar as if "staging" 
them. The Flink runner then copies everything from the application jar's 
classpath to the task managers, so we should be able to retrieve artifacts from 
the classpath directly without going through an extra re-staging step.
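
   To illustrate what "retrieve artifacts from the classpath directly" could look like, here is a minimal standard-library sketch. It is not the code in this PR: `ClasspathArtifactSketch`, `open`, and `readAll` are hypothetical names, and a temp directory on a `URLClassLoader` stands in for the job jar.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: serve artifact bytes straight from a class loader. */
public class ClasspathArtifactSketch {

  /** Opens the named classpath resource, failing loudly if it is absent. */
  public static InputStream open(ClassLoader loader, String resourceName) throws IOException {
    InputStream in = loader.getResourceAsStream(resourceName);
    if (in == null) {
      throw new IOException("Artifact not found on classpath: " + resourceName);
    }
    return in;
  }

  /** Reads the whole resource into memory (fine for a demo, not for large artifacts). */
  public static byte[] readAll(ClassLoader loader, String resourceName) throws IOException {
    try (InputStream in = open(loader, resourceName);
        ByteArrayOutputStream out = new ByteArrayOutputStream()) {
      byte[] buffer = new byte[8192];
      int n;
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
      }
      return out.toByteArray();
    }
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for the job jar: a temp directory placed on a class loader's search path.
    Path dir = Files.createTempDirectory("artifacts");
    Files.write(dir.resolve("artifact.txt"), "payload".getBytes(StandardCharsets.UTF_8));
    try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
      byte[] bytes = readAll(loader, "artifact.txt");
      System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
  }
}
```

   With this shape, only the "open the backing resource" step differs from a filesystem-backed service; no copy to a separate staging location is involved.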
 



Issue Time Tracking
---

Worklog Id: (was: 320837)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, Flink job jars re-stage all artifacts at runtime (on the 
> JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. 
> However, since the manifest and all the artifacts live on the classpath of 
> the jar, and everything from the classpath is copied to the Flink workers 
> anyway [2], it should not be necessary to do additional artifact staging. We 
> could replace BeamFileSystemArtifactRetrievalService in this case with a 
> simple ArtifactRetrievalService that just pulls the artifacts from the 
> classpath.
>  
>  [1] 
> [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java]
> [2] 
> [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93]





[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320836
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:32
Start Date: 30/Sep/19 20:32
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329756483
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/AbstractArtifactRetrievalService.java
 ##
 @@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.charset.StandardCharsets;
+import java.util.List;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ProxyManifest;
+import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc;
+import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString;
+import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.util.JsonFormat;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.Status;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Strings;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.Cache;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.CacheBuilder;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.hash.Hasher;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.hash.Hashing;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.io.ByteStreams;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * An {@link ArtifactRetrievalService} that handles everything aside from 
actually opening the
+ * backing resources.
 
 Review comment:
   :+1: 
 



Issue Time Tracking
---

Worklog Id: (was: 320836)
Time Spent: 20m  (was: 10m)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, Flink job jars re-stage all artifacts at runtime (on the 
> JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. 
> However, since the manifest and all the artifacts live on the classpath of 
> the jar, and everything from the classpath is copied to the Flink workers 
> anyway [2], it should not be necessary to do additional artifact staging. We 
> could replace BeamFileSystemArtifactRetrievalService in this case with a 
> simple ArtifactRetrievalService that just pulls the artifacts from the 
> classpath.
>  
>  [1] 
> 

[jira] [Commented] (BEAM-8327) beam_Prober_CommunityMetrics hits cache giving wrong results

2019-09-30 Thread Mikhail Gryzykhin (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941311#comment-16941311
 ] 

Mikhail Gryzykhin commented on BEAM-8327:
-

Came up from this failure.

https://issues.apache.org/jira/browse/BEAM-7902

> beam_Prober_CommunityMetrics hits cache giving wrong results
> 
>
> Key: BEAM-8327
> URL: https://issues.apache.org/jira/browse/BEAM-8327
> Project: Beam
>  Issue Type: Bug
>  Components: project-management
>Reporter: Mikhail Gryzykhin
>Priority: Major
>
> We need to fix the beam_Prober_CommunityMetrics target so that it does not hit the cache. 
> The test fetches fresh data from the website on every run, even though the binaries are the same.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8327) beam_Prober_CommunityMetrics hits cache giving wrong results

2019-09-30 Thread Mikhail Gryzykhin (Jira)
Mikhail Gryzykhin created BEAM-8327:
---

 Summary: beam_Prober_CommunityMetrics hits cache giving wrong 
results
 Key: BEAM-8327
 URL: https://issues.apache.org/jira/browse/BEAM-8327
 Project: Beam
  Issue Type: Bug
  Components: project-management
Reporter: Mikhail Gryzykhin


We need to fix the beam_Prober_CommunityMetrics target so that it does not hit the cache. 
The test fetches fresh data from the website on every run, even though the binaries are the same.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320838=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320838
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:32
Start Date: 30/Sep/19 20:32
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329758955
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java
 ##
 @@ -53,129 +36,45 @@
  * the artifact layout and retrieval token format produced by {@link
  * BeamFileSystemArtifactStagingService}.
  */
-public class BeamFileSystemArtifactRetrievalService
-extends ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceImplBase
-implements ArtifactRetrievalService {
+public class BeamFileSystemArtifactRetrievalService extends 
AbstractArtifactRetrievalService {
   private static final Logger LOG =
   LoggerFactory.getLogger(BeamFileSystemArtifactRetrievalService.class);
 
-  private static final int ARTIFACT_CHUNK_SIZE_BYTES = 2 << 20; // 2MB
+  private static final Cache MANIFEST_CACHE 
=
+  CacheBuilder.newBuilder()
+  .expireAfterAccess(1, TimeUnit.HOURS /* arbitrary */)
+  .maximumSize(100 /* arbitrary */)
+  .build();
+
+  public BeamFileSystemArtifactRetrievalService() {
+super(MANIFEST_CACHE);
+  }
 
   public static BeamFileSystemArtifactRetrievalService create() {
 return new BeamFileSystemArtifactRetrievalService();
   }
 
   @Override
-  public void getManifest(
-  ArtifactApi.GetManifestRequest request,
-  StreamObserver responseObserver) {
-final String token = request.getRetrievalToken();
-if (Strings.isNullOrEmpty(token)) {
-  throw new StatusRuntimeException(
-  Status.INVALID_ARGUMENT.withDescription("Empty artifact token"));
-}
+  public InputStream openUri(String retrievalToken, String uri) throws 
IOException {
+ResourceId artifactResourceId = FileSystems.matchNewResource(uri, false /* 
is directory */);
 
 Review comment:
   Nit: I know this was like this before, but "is directory" is ambiguous -- is 
this trying to say "false, meaning it's a directory," or "is directory = false"?
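
   One possible way to remove the ambiguity, sketched with a hypothetical `describe(String, boolean)` method standing in for the real call (this is not the Beam signature): either use a `name=` style comment before the value, or bind the value to a named local.

```java
/** Hypothetical sketch of ways to label a bare boolean argument. */
public class NamedBooleanArgSketch {

  /** Illustrative stand-in for a method that takes an "is directory" flag. */
  public static String describe(String uri, boolean isDirectory) {
    return (isDirectory ? "directory " : "file ") + uri;
  }

  public static void main(String[] args) {
    // Ambiguous: the comment could name the parameter or explain the value.
    System.out.println(describe("/tmp/artifact.bin", false /* is directory */));

    // Clearer: the comment names the parameter and sits before the value.
    System.out.println(describe("/tmp/artifact.bin", /* isDirectory= */ false));

    // Also clear: a named local carries the meaning without any comment.
    final boolean isDirectory = false;
    System.out.println(describe("/tmp/artifact.bin", isDirectory));
  }
}
```

   All three calls behave identically; only the readability at the call site changes.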
 



Issue Time Tracking
---

Worklog Id: (was: 320838)

> Flink portable pipeline jars do not need to stage artifacts remotely
> 
>
> Key: BEAM-8312
> URL: https://issues.apache.org/jira/browse/BEAM-8312
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, Flink job jars re-stage all artifacts at runtime (on the 
> JobManager) by using the usual BeamFileSystemArtifactRetrievalService [1]. 
> However, since the manifest and all the artifacts live on the classpath of 
> the jar, and everything from the classpath is copied to the Flink workers 
> anyway [2], it should not be necessary to do additional artifact staging. We 
> could replace BeamFileSystemArtifactRetrievalService in this case with a 
> simple ArtifactRetrievalService that just pulls the artifacts from the 
> classpath.
>  
>  [1] 
> [https://github.com/apache/beam/blob/340c3202b1e5824b959f5f9f626e4c7c7842a3cb/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactRetrievalService.java]
> [2] 
> [https://github.com/apache/beam/blob/2f1b56ccc506054e40afe4793a8b556e872e1865/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java#L93]





[jira] [Work logged] (BEAM-8312) Flink portable pipeline jars do not need to stage artifacts remotely

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8312?focusedWorklogId=320839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320839
 ]

ASF GitHub Bot logged work on BEAM-8312:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:32
Start Date: 30/Sep/19 20:32
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9681: [BEAM-8312] 
ArtifactRetrievalService serving artifacts from jar.
URL: https://github.com/apache/beam/pull/9681#discussion_r329763525
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/artifact/ClassLoaderArtifactServiceTest.java
 ##
 @@ -0,0 +1,412 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.artifact;
+
+import com.google.common.base.Charsets;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.net.URI;
+import java.net.URL;
+import java.net.URLClassLoader;
+import java.nio.charset.Charset;
+import java.nio.file.FileSystem;
+import java.nio.file.FileSystems;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipOutputStream;
+import org.apache.beam.model.jobmanagement.v1.ArtifactApi;
+import org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc;
+import org.apache.beam.runners.fnexecution.GrpcFnServer;
+import org.apache.beam.runners.fnexecution.InProcessServerFactory;
+import org.apache.beam.vendor.grpc.v1p21p0.com.google.protobuf.ByteString;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.ManagedChannel;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.inprocess.InProcessChannelBuilder;
+import org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.StreamObserver;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Tests for {@link ClassLoaderArtifactRetrievalService} and {@link
+ * JavaFilesystemArtifactStagingService}.
+ */
+@RunWith(JUnit4.class)
+public class ClassLoaderArtifactServiceTest {
+
+  @Rule public TemporaryFolder tempFolder = new TemporaryFolder();
+
+  private static final int ARTIFACT_CHUNK_SIZE = 100;
+
+  private static final Charset BIJECTIVE_CHARSET = Charsets.ISO_8859_1;
+
+  public interface ArtifactServicePair extends AutoCloseable {
+
+    public String getStagingToken(String nonce);
+
+    public ArtifactStagingServiceGrpc.ArtifactStagingServiceStub createStagingStub()
+        throws Exception;
+
+    public ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub createStagingBlockingStub()
+        throws Exception;
+
+    public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub createRetrievalStub()
+        throws Exception;
+
+    public ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub
+        createRetrievalBlockingStub() throws Exception;
+  }
+
+  private ArtifactServicePair classLoaderService() throws IOException {
+    return new ArtifactServicePair() {
+
+      JavaFilesystemArtifactStagingService stagingService;
+      GrpcFnServer stagingServer;
+      ClassLoaderArtifactRetrievalService retrievalService;
+      GrpcFnServer retrievalServer;
+
+      ArtifactStagingServiceGrpc.ArtifactStagingServiceStub stagingStub;
+      ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub stagingBlockingStub;
+      ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceStub retrievalStub;
+      ArtifactRetrievalServiceGrpc.ArtifactRetrievalServiceBlockingStub retrievalBlockingStub;
+
+      Path jarPath = 

[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=320835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320835
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Sep/19 20:30
Start Date: 30/Sep/19 20:30
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9418: [BEAM-5428] Implement 
cross-bundle user state caching in the Python SDK
URL: https://github.com/apache/beam/pull/9418#issuecomment-536738137
 
 
   Python tests broken on master, see 
https://builds.apache.org/job/beam_PreCommit_Python_Cron/ and 
https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/
 



Issue Time Tracking
---

Worklog Id: (was: 320835)
Time Spent: 28h 20m  (was: 28h 10m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Maximilian Michels
> Priority: Major
> Time Spent: 28h 20m
> Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941287#comment-16941287
 ] 

Valentyn Tymofieiev commented on BEAM-8324:
---

Sent https://github.com/apache/beam/pull/9695. See also: 
https://github.com/uqfoundation/dill/issues/341.

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Valentyn Tymofieiev
> Priority: Blocker
> Fix For: 2.16.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 773, in run
>     self._load_main_session(self.local_staging_directory)
>   File "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 489, in _load_main_session
>     pickler.load_session(session_file)
>   File "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", line 287, in load_session
>     return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in load_session
>     module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 were given
> {noformat}
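
The message in the traceback above is Python's standard complaint when a function receives more positional arguments than its signature allows. One plausible reading, consistent with the incompatibility between dill 0.3.0 and 0.3.1.1 named in the pull request titles, is that the session was pickled by a dill version that records seven arguments for `_create_function` and unpickled by one that accepts at most six; that direction is an assumption here, not something the thread states. The stand-in function below is hypothetical and only mirrors the arity from the message; it is not dill's code.

```python
def _create_function(fcode, fglobals, fname=None, fargs=None,
                     fclosure=None, fdict=None):
    """Hypothetical stand-in that accepts between 2 and 6 positional arguments."""
    return (fcode, fglobals, fname, fargs, fclosure, fdict)


# Arguments as a newer pickler might have recorded them: seven values.
recorded_args = ("code", {}, "name", None, None, {}, "extra")

try:
    _create_function(*recorded_args)
    message = None
except TypeError as err:
    # Same shape of error as in the Dataflow worker traceback.
    message = str(err)

print(message)
```

Because the argument list is baked into the pickled session, the failure only shows up at load time on the worker, which matches the traceback ending in `unpickler.load()`.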





[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941284#comment-16941284
 ] 

Valentyn Tymofieiev commented on BEAM-8324:
---

I think 0.3.0 will be a safer option for now. We can continue investigating 
the 0.3.1.1 changes in https://issues.apache.org/jira/browse/BEAM-5878. 



[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320804=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320804
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:33
Start Date: 30/Sep/19 19:33
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9696: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9696

[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320791=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320791
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:28
Start Date: 30/Sep/19 19:28
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9693: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9693#issuecomment-536715145
 
 
   Closing in favor of https://github.com/apache/beam/pull/9695
 



Issue Time Tracking
---

Worklog Id: (was: 320791)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320793=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320793
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:28
Start Date: 30/Sep/19 19:28
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9694: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9694
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 320793)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320792=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320792
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:28
Start Date: 30/Sep/19 19:28
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9693: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9693
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 320792)
Time Spent: 50m  (was: 40m)



[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320790=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320790
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 19:27
Start Date: 30/Sep/19 19:27
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9695: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9695

[jira] [Commented] (BEAM-8303) Filesystems not properly registered using FileIO.write()

2019-09-30 Thread Preston Koprivica (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941273#comment-16941273
 ] 

Preston Koprivica commented on BEAM-8303:
-

[~mxm] [~markflyhigh] I was able to test the fix by pulling Max's branch.  I 
can verify that with the fix I'm no longer seeing the original error.  Thanks 
so much for diagnosing and thanks for the quick turnaround.
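
For readers unfamiliar with the failure mode, the error reported in this issue ("No filesystem found for scheme s3") comes from a lookup in a scheme-keyed registry: a path such as `s3://...` can only be resolved in a process where a filesystem for `s3` has been registered. The sketch below is a toy Python model of that pattern, not Beam's implementation; the class, the method names, and the bucket path are made up for illustration. It shows how the same lookup succeeds where registration ran and fails where it did not.

```python
from urllib.parse import urlparse


class FileSystemRegistry:
    """Toy scheme -> filesystem registry (hypothetical, not Beam's API)."""

    def __init__(self):
        # Only the local filesystem is known until something registers more.
        self._by_scheme = {"file": "LocalFileSystem"}

    def register(self, scheme, filesystem):
        self._by_scheme[scheme] = filesystem

    def lookup(self, path):
        scheme = urlparse(path).scheme or "file"
        if scheme not in self._by_scheme:
            raise ValueError("No filesystem found for scheme " + scheme)
        return self._by_scheme[scheme]


# Process where the options were applied: the s3 filesystem is registered.
submitter = FileSystemRegistry()
submitter.register("s3", "S3FileSystem")
resolved = submitter.lookup("s3://bucket/output/part-0.parquet")

# Process where registration never ran: the same lookup fails.
worker = FileSystemRegistry()
try:
    worker.lookup("s3://bucket/output/part-0.parquet")
    error = None
except ValueError as err:
    error = str(err)
```

In the reported stack traces the lookup is triggered while a coder decodes a record, so the failing process is whichever one deserializes the data, not necessarily the one that built the pipeline.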

> Filesystems not properly registered using FileIO.write()
>
> Key: BEAM-8303
> URL: https://issues.apache.org/jira/browse/BEAM-8303
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.15.0
> Reporter: Preston Koprivica
> Assignee: Maximilian Michels
> Priority: Critical
> Fix For: 2.16.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> I’m getting the following error when attempting to use the FileIO apis 
> (beam-2.15.0) and integrating with AWS S3.  I have setup the PipelineOptions 
> with all the relevant AWS options, so the filesystem registry **should** be 
> properly seeded by the time the graph is compiled and executed:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83)
>     at org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480)
>     at org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93)
>     at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
>     at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106)
>     at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72)
>     at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47)
>     at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
>     at org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107)
>     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503)
>     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> For reference, the write code resembles this:
> {code:java}
>  FileIO.Write write = FileIO.write()
>     .via(ParquetIO.sink(schema))
>     .to(options.getOutputDir()) // will be something like: s3:///
>     .withSuffix(".parquet");
> records.apply(String.format("Write(%s)", options.getOutputDir()), write);{code}
> The issue does not appear to be related to ParquetIO.sink().  I am able to 
> reliably reproduce the issue using JSON formatted records and TextIO.sink(), 
> as well.  Moreover, AvroIO is affected if withWindowedWrites() option is 
> added.
> Just trying some different knobs, I went ahead and set the following option:
> {code:java}
> write = write.withNoSpilling();{code}
> This actually seemed to fix the issue, only to have it reemerge as I scaled 
> up the data set size.  The stack trace, while very similar, reads:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82)
>     at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36)
>   

[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941261#comment-16941261
 ] 

Valentyn Tymofieiev commented on BEAM-8324:
---

I am investigating whether the dill upper bound should be <0.3.1 or <0.3.2. It 
appears that 0.3.1.1 causes stack overflow errors on master in one of the tests 
on Python 3.7 when running through tox.
 
{noformat}
test_remote_runner_display_data 
(apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
Fatal Python error: Cannot recover from stack overflow.

Current thread 0x7fc19e177740 (most recent call first):
  File 
"/usr/local/google/home/valentyn/projects/beam/beam/beam/sdks/python/target/.tox/py37-gcp/lib/python3.7/site-packages/dill/_dill.py",
 line 382 in get
  File "/usr/lib/python3.7/pickle.py", line 502 in save
  File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
...
{noformat}

However, there is no error if we run the test by itself:
{noformat}
python ./setup.py test -s 
apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data
test_remote_runner_display_data 
(apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... ok

--
Ran 1 test in 0.063s

OK
{noformat}

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}
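
For readers unfamiliar with this kind of failure: the TypeError in the quoted traceback is the message Python produces when a callable that accepts between two and six positional arguments is handed seven. The sketch below reproduces only the message shape. The function is a stand-in, not dill's actual code, and its parameter names and the recorded argument tuple are assumptions made for illustration.

```python
def _create_function(fcode, fglobals, fname=None, fdefaults=None,
                     fclosure=None, fdict=None):
    """Stand-in with two required and four optional positional parameters.

    Hypothetical: it only mimics the arity named in the error message.
    """
    return (fcode, fglobals, fname, fdefaults, fclosure, fdict)


# Assumption: a payload written by a newer producer that records seven
# positional arguments for the same callable.
recorded_args = ("code", {}, "f", None, None, {}, None)

try:
    _create_function(*recorded_args)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

If the writer and the reader of a serialized payload disagree on the arity of a reconstruction helper, loading fails in this way regardless of what the payload contains.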





[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941248#comment-16941248
 ] 

Ahmet Altay commented on BEAM-8324:
---

Let's put an upper bound. How about

dill>=0.3.1.1,<0.3.2

I believe this impacts the 2.16.0 release. See: 
https://github.com/apache/beam/blob/release-2.16.0/sdks/python/setup.py
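
As a rough illustration of what a bounded requirement such as the one proposed above admits, the following sketch checks a few version strings against an inclusive lower bound and an exclusive upper bound. It is a simplified comparison on integer tuples, not the full PEP 440 logic that pip applies, and the helper names and the candidate list (including 0.3.1.2) are hypothetical.

```python
def parse_version(text):
    """Turn a dotted release string such as '0.3.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def within_bounds(version, lower="0.3.1.1", upper="0.3.2"):
    """Return True when lower <= version < upper (simplified comparison)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Hypothetical candidates, chosen only to exercise both bounds.
candidates = ["0.3.0", "0.3.1.1", "0.3.1.2", "0.3.2"]
accepted = [v for v in candidates if within_bounds(v)]
print(accepted)
```

The exclusive upper bound is what keeps a future 0.3.2 release from being picked up automatically.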

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}





[jira] [Updated] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8324:
--
Priority: Blocker  (was: Major)

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}





[jira] [Assigned] (BEAM-8093) py36-gcp is missing from the current tox.ini

2019-09-30 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-8093:
---

Assignee: Luke Cwik

> py36-gcp is missing from the current tox.ini
> 
>
> Key: BEAM-8093
> URL: https://issues.apache.org/jira/browse/BEAM-8093
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Affects Versions: 2.16.0
>Reporter: Valentyn Tymofieiev
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cc: [~udim]





[jira] [Resolved] (BEAM-8093) py36-gcp is missing from the current tox.ini

2019-09-30 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri resolved BEAM-8093.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> py36-gcp is missing from the current tox.ini
> 
>
> Key: BEAM-8093
> URL: https://issues.apache.org/jira/browse/BEAM-8093
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Affects Versions: 2.16.0
>Reporter: Valentyn Tymofieiev
>Assignee: Luke Cwik
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> cc: [~udim]





[jira] [Work logged] (BEAM-6923) OOM errors in jobServer when using GCS artifactDir

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6923?focusedWorklogId=320736=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320736
 ]

ASF GitHub Bot logged work on BEAM-6923:


Author: ASF GitHub Bot
Created on: 30/Sep/19 18:28
Start Date: 30/Sep/19 18:28
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9647: [BEAM-6923] 
limit gcs buffer size to 1MB for artifact upload
URL: https://github.com/apache/beam/pull/9647
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 320736)
Time Spent: 3h  (was: 2h 50m)

> OOM errors in jobServer when using GCS artifactDir
> --
>
> Key: BEAM-6923
> URL: https://issues.apache.org/jira/browse/BEAM-6923
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Reporter: Lukasz Gajowy
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: Instance counts.png, Paths to GC root.png, 
> Telemetries.png, beam6923-flink156.m4v, beam6923flink182.m4v, heapdump 
> size-sorted.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> When starting jobServer with artifactDir pointing to a GCS bucket: 
> {code:java}
> ./gradlew :beam-runners-flink_2.11-job-server:runShadow 
> -PflinkMasterUrl=localhost:8081 -PartifactsDir=gs://the-bucket{code}
> and running a Java portable pipeline with the following portability-related 
> pipeline options: 
> {code:java}
> --runner=PortableRunner --jobEndpoint=localhost:8099 
> --defaultEnvironmentType=DOCKER 
> --defaultEnvironmentConfig=gcr.io//java:latest{code}
>  
> I'm facing a series of OOM errors, like this: 
> {code:java}
> Exception in thread "grpc-default-executor-3" java.lang.OutOfMemoryError: 
> Java heap space
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.buildContentChunk(MediaHttpUploader.java:606)
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:408)
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:508)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
> at 
> com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:301)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745){code}
>  
> This does not happen when I'm using a local filesystem for the artifact 
> staging location. 
>  
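
The stack trace above fails while building an upload chunk, and the linked pull request title describes limiting the GCS buffer size to 1MB. The general idea of a bounded transfer buffer can be sketched as follows. This is an illustration only, not the Beam or GCS client code: the function name, the in-memory streams, and the payload size are assumptions.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB, mirroring the limit named in the PR title


def copy_in_chunks(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink while holding at most chunk_size bytes at a time.

    Returns the total bytes copied and the largest single chunk seen.
    """
    total = 0
    largest = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


# Hypothetical payload: a little over three chunks.
payload = b"x" * (3 * CHUNK_SIZE + 123)
source = io.BytesIO(payload)
sink = io.BytesIO()
total, largest = copy_in_chunks(source, sink)
print(total, largest)
```

With a fixed chunk size, the memory held per in-flight upload stays bounded even when many uploads run concurrently.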





[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320707=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320707
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:52
Start Date: 30/Sep/19 17:52
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9694: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9694
 
 
   
   

[jira] [Commented] (BEAM-8303) Filesystems not properly registered using FileIO.write()

2019-09-30 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941177#comment-16941177
 ] 

Mark Liu commented on BEAM-8303:


Do you have an ETA for the fix? I'm ready to cut an RC. 

> Filesystems not properly registered using FileIO.write()
> 
>
> Key: BEAM-8303
> URL: https://issues.apache.org/jira/browse/BEAM-8303
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.15.0
>Reporter: Preston Koprivica
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.16.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I’m getting the following error when attempting to use the FileIO APIs 
> (beam-2.15.0) and integrating with AWS S3.  I have set up the PipelineOptions 
> with all the relevant AWS options, so the filesystem registry **should** be 
> properly seeded by the time the graph is compiled and executed:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at 
> org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at 
> org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at 
> org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:83)
>     at 
> org.apache.beam.sdk.transforms.join.UnionCoder.decode(UnionCoder.java:32)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:534)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:480)
>     at 
> org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.deserialize(CoderTypeSerializer.java:93)
>     at 
> org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
>     at 
> org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106)
>     at 
> org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:72)
>     at 
> org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:47)
>     at 
> org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:73)
>     at 
> org.apache.flink.runtime.operators.FlatMapDriver.run(FlatMapDriver.java:107)
>     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:503)
>     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> For reference, the write code resembles this:
> {code:java}
>  FileIO.Write write = FileIO.write()
>      .via(ParquetIO.sink(schema))
>      .to(options.getOutputDir()) // will be something like: s3:///
>      .withSuffix(".parquet");
> records.apply(String.format("Write(%s)", options.getOutputDir()), write);{code}
> The issue does not appear to be related to ParquetIO.sink().  I am able to 
> reliably reproduce the issue using JSON-formatted records and TextIO.sink(), 
> as well.  Moreover, AvroIO is affected if the withWindowedWrites() option is 
> added.
> Just trying some different knobs, I went ahead and set the following option:
> {code:java}
> write = write.withNoSpilling();{code}
> This actually seemed to fix the issue, only to have it reemerge as I scaled 
> up the data set size.  The stack trace, while very similar, reads:
> {code:java}
>  java.lang.IllegalArgumentException: No filesystem found for scheme s3
>     at 
> org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>     at 
> org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1149)
>     at 
> org.apache.beam.sdk.io.FileBasedSink$FileResultCoder.decode(FileBasedSink.java:1105)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82)
>     at org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36)
>     at 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:543)
>     at 
> 

[jira] [Work logged] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?focusedWorklogId=320702=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320702
 ]

ASF GitHub Bot logged work on BEAM-8324:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:43
Start Date: 30/Sep/19 17:43
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9693: [BEAM-8324] 
Restrict the upper bound for dill due to incompatibility between versions 0.3.0 
and 0.3.1.1.
URL: https://github.com/apache/beam/pull/9693
 
 
   
   

[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320684=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320684
 ]

ASF GitHub Bot logged work on BEAM-7223:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:14
Start Date: 30/Sep/19 17:14
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add 
Python 3.5 Flink ValidatesRunner postcommit suite. 
URL: https://github.com/apache/beam/pull/9691#issuecomment-536659352
 
 
   Run Python 3.5 Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 320684)
Time Spent: 3h 40m  (was: 3.5h)

> Add ValidatesRunner test suite for Flink on Python 3.
> -
>
> Key: BEAM-7223
> URL: https://issues.apache.org/jira/browse/BEAM-7223
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Ankur Goenka
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Add py3 integration tests for Flink





[jira] [Closed] (BEAM-8314) Beam Fn Api metrics piling causes pipeline to stuck after running for a while

2019-09-30 Thread Yichi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yichi Zhang closed BEAM-8314.
-
Resolution: Fixed

> Beam Fn Api metrics piling causes pipeline to stuck after running for a while
> -
>
> Key: BEAM-8314
> URL: https://issues.apache.org/jira/browse/BEAM-8314
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Yichi Zhang
>Priority: Blocker
> Fix For: 2.16.0
>
> Attachments: E4UaSUhJJKF.png
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> It seems that in StreamingDataflowWorker we are not able to send metric 
> updates to the Dataflow service fast enough. The metrics pile up, which causes 
> memory usage to increase and eventually leads to excessive memory 
> thrashing/GC, almost stopping the pipeline from processing new items.
>  
>  !E4UaSUhJJKF.png! 





[jira] [Resolved] (BEAM-7919) Add a Python 3 test scenario for MongoDB IO

2019-09-30 Thread Yichi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yichi Zhang resolved BEAM-7919.
---
Fix Version/s: 2.17.0
   Resolution: Fixed

> Add a Python 3 test scenario for MongoDB IO
> ---
>
> Key: BEAM-7919
> URL: https://issues.apache.org/jira/browse/BEAM-7919
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-ideas
>Reporter: Valentyn Tymofieiev
>Assignee: Yichi Zhang
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Python 2 MongoDB IO suite was added in:
> https://github.com/apache/beam/commit/17bf89d6070565b715f44ecb5f6394219b94cfe6
> We should also exercise this IO in Python 3. 
> cc: [~chamikara] [~altay]





[jira] [Work logged] (BEAM-7223) Add ValidatesRunner test suite for Flink on Python 3.

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7223?focusedWorklogId=320681=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320681
 ]

ASF GitHub Bot logged work on BEAM-7223:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:11
Start Date: 30/Sep/19 17:11
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9691: [BEAM-7223] Add 
Python 3.5 Flink ValidatesRunner postcommit suite. 
URL: https://github.com/apache/beam/pull/9691#issuecomment-536658374
 
 
   Run Python 3.5 ValidatesRunner Flink
 



Issue Time Tracking
---

Worklog Id: (was: 320681)
Time Spent: 3.5h  (was: 3h 20m)

> Add ValidatesRunner test suite for Flink on Python 3.
> -
>
> Key: BEAM-7223
> URL: https://issues.apache.org/jira/browse/BEAM-7223
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Ankur Goenka
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Add py3 integration tests for Flink





[jira] [Work started] (BEAM-7919) Add a Python 3 test scenario for MongoDB IO

2019-09-30 Thread Yichi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-7919 started by Yichi Zhang.
-
> Add a Python 3 test scenario for MongoDB IO
> ---
>
> Key: BEAM-7919
> URL: https://issues.apache.org/jira/browse/BEAM-7919
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-ideas
>Reporter: Valentyn Tymofieiev
>Assignee: Yichi Zhang
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Python 2 MongoDB IO suite was added in:
> https://github.com/apache/beam/commit/17bf89d6070565b715f44ecb5f6394219b94cfe6
> We should also exercise this IO in Python 3. 
> cc: [~chamikara] [~altay]





[jira] [Updated] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-8324:
--
Status: Open  (was: Triage Needed)

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.16.0
>
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}





[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941147#comment-16941147
 ] 

Valentyn Tymofieiev commented on BEAM-8324:
---

I also checked that 2.15.0 release was not affected by this.

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.16.0
>
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}





[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941146#comment-16941146
 ] 

Valentyn Tymofieiev commented on BEAM-8324:
---

I think we should restore the upper bound for dill, since it may be difficult 
to avoid these breakages unless dill has compatibility guarantees between 
versions.
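
As a rough illustration of what an upper bound buys here, the sketch below checks whether a dill version falls inside a half-open range. The helper names and the bounds 0.3.0 / 0.3.1 are assumptions chosen for the example, not Beam's actual requirement, and the parsing only handles plain dotted numeric versions.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '0.3.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def within_bounds(version, lower, upper):
    """Return True when lower <= version < upper (the upper bound is exclusive)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Hypothetical bounds, for illustration only.
LOWER, UPPER = "0.3.0", "0.3.1"

for candidate in ("0.2.9", "0.3.0", "0.3.1.1"):
    print(candidate, within_bounds(candidate, LOWER, UPPER))
```

With a bound like this in place, a newly published dill release would not be picked up until the bound is raised deliberately.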

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.16.0
>
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}





[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320675=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320675
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:04
Start Date: 30/Sep/19 17:04
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on issue #9664: [BEAM-7389] 
Created code files to match doc filenames
URL: https://github.com/apache/beam/pull/9664#issuecomment-536655440
 
 
   Updated docs and obsolete files deleted in #9692, but they won't render 
correctly until this one is merged.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 320675)
Time Spent: 61.5h  (was: 61h 20m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 61.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-6896) Beam Dependency Update Request: PyYAML

2019-09-30 Thread Yifan Zou (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Zou updated BEAM-6896:

Fix Version/s: (was: Not applicable)
   2.17.0

> Beam Dependency Update Request: PyYAML
> --
>
> Key: BEAM-6896
> URL: https://issues.apache.org/jira/browse/BEAM-6896
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.17.0
>
>
>  - 2019-03-25 04:17:47.501359 
> -
> Please consider upgrading the dependency PyYAML. 
> The current version is 3.13. The latest version is 5.1 
> cc: 
>  Please refer to the [Beam Dependency Guide|https://beam.apache.org/contribute/dependencies/] 
> for more information. 
> Do Not Modify The Description Above. 





[jira] [Closed] (BEAM-8326) Run Portable_Python PreCommit fails with pickling

2019-09-30 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik closed BEAM-8326.
---
Fix Version/s: Not applicable
   Resolution: Duplicate

> Run Portable_Python PreCommit fails with pickling
> -
>
> Key: BEAM-8326
> URL: https://issues.apache.org/jira/browse/BEAM-8326
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: Not applicable
>
>
> Test log: 
> https://builds.apache.org/job/beam_PreCommit_Portable_Python_Phrase/434/console
> I see the following error:
> 09:31:34 Traceback (most recent call last):
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", 
> line 261, in loads
> 09:31:34 return dill.loads(s)
> 09:31:34   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 
> 317, in loads
> 09:31:34 return load(file, ignore)
> 09:31:34   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 
> 305, in load
> 09:31:34 obj = pik.load()
> 09:31:34 TypeError: _create_function() takes from 2 to 6 positional arguments 
> but 7 were given
> 09:31:34 
> 09:31:34 During handling of the above exception, another exception occurred:
> 09:31:34 
> 09:31:34 Traceback (most recent call last):
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 158, in _execute
> 09:31:34 response = task()
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 191, in 
> 09:31:34 self._execute(lambda: worker.do_instruction(work), work)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 343, in do_instruction
> 09:31:34 request.instruction_id)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 363, in process_bundle
> 09:31:34 instruction_id, request.process_bundle_descriptor_id)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 306, in get
> 09:31:34 self.data_channel_factory)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 580, in __init__
> 09:31:34 self.ops = 
> self.create_execution_tree(self.process_bundle_descriptor)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 624, in create_execution_tree
> 09:31:34 descriptor.transforms, key=topological_height, reverse=True)])
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 623, in 
> 09:31:34 for transform_id in sorted(
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 548, in wrapper
> 09:31:34 result = cache[args] = func(*args)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 607, in get_operation
> 09:31:34 in descriptor.transforms[transform_id].outputs.items()
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 606, in 
> 09:31:34 for tag, pcoll_id
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 605, in 
> 09:31:34 tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]]
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 548, in wrapper
> 09:31:34 result = cache[args] = func(*args)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 610, in get_operation
> 09:31:34 transform_id, transform_consumers)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 869, in create_operation
> 09:31:34 return creator(self, transform_id, transform_proto, payload, 
> consumers)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 1112, in create
> 09:31:34 serialized_fn, parameter)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 1150, in _create_pardo_operation
> 09:31:34 dofn_data = pickler.loads(serialized_fn)
> 09:31:34   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", 
> line 265, in loads
> 09:31:34 return dill.loads(s)
> 09:31:34   File 

[jira] [Comment Edited] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941140#comment-16941140
 ] 

Valentyn Tymofieiev edited comment on BEAM-8324 at 9/30/19 5:01 PM:


I think the cause of the error is that we pickle main session using 
dill==0.3.1.1, and unpickle it using dill==0.3.0 (which is installed on DF 
workers).
We can update the version of dill on DF workers, or restrict the upper bound 
for dill, which may be advisable given [~yoshiki.obata]'s report in 
https://issues.apache.org/jira/browse/BEAM-5878?focusedCommentId=16941114=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16941114.
 


was (Author: tvalentyn):
I think the cause of the error is that we pickle main session using 
dill==0.3.1.1, and unpickle it using dill==0.2.9 (which is installed on DF 
workers).
We can update the version of Dill in  DF workers, or restrict the upperbound 
for dill, which may be advisable given [~yoshiki.obata]'s report in 
https://issues.apache.org/jira/browse/BEAM-5878?focusedCommentId=16941114=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16941114.
 

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.16.0
>
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}





[jira] [Commented] (BEAM-8324) Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 positional arguments but 7 were given

2019-09-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941140#comment-16941140
 ] 

Valentyn Tymofieiev commented on BEAM-8324:
---

I think the cause of the error is that we pickle main session using 
dill==0.3.1.1, and unpickle it using dill==0.2.9 (which is installed on DF 
workers).
We can update the version of dill on DF workers, or restrict the upper bound 
for dill, which may be advisable given [~yoshiki.obata]'s report in 
https://issues.apache.org/jira/browse/BEAM-5878?focusedCommentId=16941114=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16941114.
 

> Pre/Postcommit Dataflow IT tests fail: _create_function() takes from 2 to 6 
> positional arguments but 7 were given
> -
>
> Key: BEAM-8324
> URL: https://issues.apache.org/jira/browse/BEAM-8324
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.16.0
>
>
> {noformat}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/internal/pickler.py", 
> line 287, in load_session
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.7/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
> TypeError: _create_function() takes from 2 to 6 positional arguments but 7 
> were given
> {noformat}





[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=320672=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320672
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 30/Sep/19 17:01
Start Date: 30/Sep/19 17:01
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on pull request #9692: 
[BEAM-7389] Update docs with matching code files
URL: https://github.com/apache/beam/pull/9692
 
 
   **Merge after #9664**
   
   Second phase to update code filenames to match the docs filenames.
   
   * Updates the docs
   * Deletes obsolete code files
   
   TODO:
   
   * Wait until #9664 merges so new code files are available
   * Regenerate the `flatmap` notebook
   
   > Staged files will be broken until the new code files are merged
   
   FlatMap: 
http://apache-beam-website-pull-requests.storage.googleapis.com/9669/documentation/transforms/python/elementwise/flatmap/index.html
   
   ToString: 
http://apache-beam-website-pull-requests.storage.googleapis.com/9669/documentation/transforms/python/elementwise/tostring/index.html
   
   WithTimestamps: 
http://apache-beam-website-pull-requests.storage.googleapis.com/9669/documentation/transforms/python/elementwise/withtimestamps/index.html
   
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Comment Edited] (BEAM-6896) Beam Dependency Update Request: PyYAML

2019-09-30 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941135#comment-16941135
 ] 

Brian Hulette edited comment on BEAM-6896 at 9/30/19 5:01 PM:
--

[~eblanchi_cni] also pointed out that there was a security flaw in pyyaml, 
resolved in 5.1. Details here https://github.com/yaml/pyyaml/issues/207


was (Author: bhulette):
[~erbianchi] also pointed out that there was a security flaw in pyyaml, 
resolved in 5.1. Details here https://github.com/yaml/pyyaml/issues/207

> Beam Dependency Update Request: PyYAML
> --
>
> Key: BEAM-6896
> URL: https://issues.apache.org/jira/browse/BEAM-6896
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: Not applicable
>
>
>  - 2019-03-25 04:17:47.501359 
> -
> Please consider upgrading the dependency PyYAML. 
> The current version is 3.13. The latest version is 5.1 
> cc: 
>  Please refer to the [Beam Dependency Guide|https://beam.apache.org/contribute/dependencies/] 
> for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-8300) KinesisIO.write causes NPE as the producer is null

2019-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8300?focusedWorklogId=320670=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-320670
 ]

ASF GitHub Bot logged work on BEAM-8300:


Author: ASF GitHub Bot
Created on: 30/Sep/19 16:59
Start Date: 30/Sep/19 16:59
Worklog Time Spent: 10m 
  Work Description: jhalaria commented on issue #9640: [BEAM-8300]: 
KinesisIO.write throws NPE because producer is null
URL: https://github.com/apache/beam/pull/9640#issuecomment-536653666
 
 
   @aromanenko-dev - Updated the commit message. Sorry I was out for a couple 
of days, otherwise would have done it sooner.
 



Issue Time Tracking
---

Worklog Id: (was: 320670)
Time Spent: 4.5h  (was: 4h 20m)

> KinesisIO.write causes NPE as the producer is null
> --
>
> Key: BEAM-8300
> URL: https://issues.apache.org/jira/browse/BEAM-8300
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kinesis
>Affects Versions: 2.15.0
>Reporter: Ankit Jhalaria
>Assignee: Ankit Jhalaria
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> While using KinesisIO.write(), we encountered an NPE with the following stack 
> trace:
> {code:java}
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:297)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
>     at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
>     at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:45)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException: null
>     at org.apache.beam.sdk.io.kinesis.KinesisIO$Write$KinesisWriterFn.flushBundle(KinesisIO.java:685)
>     at org.apache.beam.sdk.io.kinesis.KinesisIO$Write$KinesisWriterFn.finishBundle(KinesisIO.java:669)
> {code}




