[jira] [Updated] (BEAM-5626) Several IO tests fail in Python 3 with RuntimeError('dictionary changed size during iteration',)}

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-5626:
--
Summary: Several IO tests fail in Python 3 with RuntimeError('dictionary 
changed size during iteration',)}  (was: Several IO tests fail in Python  with 
RuntimeError('dictionary changed size during iteration',)})

> Several IO tests fail in Python 3 with RuntimeError('dictionary changed size 
> during iteration',)}
> -
>
> Key: BEAM-5626
> URL: https://issues.apache.org/jira/browse/BEAM-5626
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
>  ERROR: test_delete_dir (apache_beam.io.hadoopfilesystem_test.HadoopFileSystemTest)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem_test.py", line 506, in test_delete_dir
>     self.fs.delete([url_t1])
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem.py", line 370, in delete
>     raise BeamIOError("Delete operation failed", exceptions)
> apache_beam.io.filesystem.BeamIOError: Delete operation failed with exceptions {'hdfs://test_dir/new_dir1': RuntimeError('dictionary changed size during iteration',)}
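
For context, here is a minimal sketch of how this error typically arises on Python 3 and one common way to avoid it. The function names and sample data are hypothetical and are not taken from hadoopfilesystem.py, so the actual cause in Beam may differ. The sketch relies on one known difference: in Python 3, dict.keys() returns a live view rather than a list copy, so deleting entries while looping over it raises RuntimeError.

```python
def prune_unsafe(entries):
    """Deletes keys while iterating over the live key view; fails on Python 3."""
    for key in entries.keys():
        if key.startswith("tmp"):
            del entries[key]
    return entries


def prune_safe(entries):
    """Iterates over a snapshot of the keys, so deleting entries is allowed."""
    for key in list(entries.keys()):
        if key.startswith("tmp"):
            del entries[key]
    return entries


def demo():
    """Runs both variants on the same (hypothetical) input."""
    try:
        prune_unsafe({"tmp_a": 1, "keep": 2})
        unsafe_error = None
    except RuntimeError as exc:
        unsafe_error = str(exc)
    return unsafe_error, prune_safe({"tmp_a": 1, "keep": 2})
```

Wrapping the keys in list() works on both Python 2 and Python 3, which is why it is a common fix when porting code.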



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150611&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150611
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 05:43
Start Date: 03/Oct/18 05:43
Worklog Time Spent: 10m 
  Work Description: timrobertson100 edited a comment on issue #6289: 
[BEAM-5036] Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#issuecomment-426518043
 
 
   I have tested HDFS (the whole point of this PR) and the performance is 
listed above (100 minutes drops to 42 minutes for a basic rewrite of a 1.5 TB 
Avro file on a 10-node cluster).
   
   GCS rename() does a copy() and delete() internally, so it should have the 
same runtime, but I agree that it should be confirmed, as should local.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150611)
Time Spent: 9h  (was: 8h 50m)

> Optimize FileBasedSink's WriteOperation.moveToOutput()
> --
>
> Key: BEAM-5036
> URL: https://issues.apache.org/jira/browse/BEAM-5036
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-files
>Affects Versions: 2.5.0
>Reporter: Jozef Vilcek
>Assignee: Tim Robertson
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> The moveToOutput() methods in FileBasedSink.WriteOperation implement move by 
> copy+delete. It would be better to use rename(), which can be much more 
> efficient on some filesystems.
> The filesystem must support cross-directory rename. BEAM-4861 is related to 
> this for the HDFS filesystem.
> The feature was discussed here:
> http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E
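
The difference between the two strategies can be sketched on a local filesystem. This is an illustration in Python, not the Java code from the pull request; the helper names and the tmp/out directory layout are hypothetical. On a local POSIX filesystem, a rename within the same filesystem only updates metadata, whereas copy+delete reads and writes every byte.

```python
import os
import shutil
import tempfile


def move_by_copy_delete(src, dst):
    """Moves a file by copying its contents, then deleting the source."""
    shutil.copyfile(src, dst)
    os.remove(src)


def move_by_rename(src, dst):
    """Moves a file with a single rename call (here across directories)."""
    os.rename(src, dst)


def demo():
    """Moves one file with each strategy from a temp dir to an output dir."""
    root = tempfile.mkdtemp()
    try:
        tmp_dir = os.path.join(root, "tmp")
        out_dir = os.path.join(root, "out")
        os.mkdir(tmp_dir)
        os.mkdir(out_dir)
        results = []
        for name, move in (("a", move_by_copy_delete), ("b", move_by_rename)):
            src = os.path.join(tmp_dir, name)
            dst = os.path.join(out_dir, name)
            with open(src, "w") as f:
                f.write("payload-" + name)
            move(src, dst)
            with open(dst) as f:
                # Record whether the source still exists and what arrived.
                results.append((os.path.exists(src), f.read()))
        return results
    finally:
        shutil.rmtree(root)
```

Both strategies end in the same state; only the cost differs, which is why the benefit depends on whether the target filesystem implements rename natively.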





[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150610
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 05:43
Start Date: 03/Oct/18 05:43
Worklog Time Spent: 10m 
  Work Description: timrobertson100 edited a comment on issue #6289: 
[BEAM-5036] Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#issuecomment-426518043
 
 
   I have tested HDFS (the whole point of this PR) and the performance is 
listed above (100 minutes drops to 42 minutes for a basic rewrite of a 1.5 TB 
Avro file on a 10-node cluster).
   
   GCS rename does a copy() and delete() internally, so it should have the 
same runtime, but I agree that it should be confirmed, as should local.




Issue Time Tracking
---

Worklog Id: (was: 150610)
Time Spent: 8h 50m  (was: 8h 40m)






[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150609
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 05:42
Start Date: 03/Oct/18 05:42
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on issue #6289: [BEAM-5036] 
Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#issuecomment-426518043
 
 
   I have tested HDFS (the whole point of this PR) and the performance is 
listed above (100 minutes drops to 37 minutes for a basic rewrite of a 1.5 TB 
Avro file on a 10-node cluster).
   
   GCS rename does a copy() and delete() internally, so it should have the 
same runtime, but I agree that it should be confirmed, as should local.




Issue Time Tracking
---

Worklog Id: (was: 150609)
Time Spent: 8h 40m  (was: 8.5h)






[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150608&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150608
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 05:40
Start Date: 03/Oct/18 05:40
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#6289: [BEAM-5036] Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#discussion_r222187077
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileSystemsTest.java
 ##
 @@ -186,6 +189,56 @@ public void testRenameIgnoreMissingFiles() throws Exception {
         containsInAnyOrder("content3"));
   }
 
+  @Test
+  public void testRenameOverwriteExistingFilesThrows() throws Exception {
+    // Simulate a filesystem which will not overwrite existing files
+    FileSystem fs = new LocalFileSystem(new CopyOption[0]);
+    thrown.expect(FileAlreadyExistsException.class);
+    executeRename(fs, false);
+  }
+
+  @Test
+  public void testRenameOverwriteExistingFiles() throws Exception {
+    // Simulate a filesystem which will not overwrite existing files
 
 Review comment:
   Please see above. We're testing FileSystems behavior when a FileSystem 
implementation does not behave like a LocalFileSystem.




Issue Time Tracking
---

Worklog Id: (was: 150608)
Time Spent: 8.5h  (was: 8h 20m)






[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150607&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150607
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 05:39
Start Date: 03/Oct/18 05:39
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#6289: [BEAM-5036] Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#discussion_r222186921
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java
 ##
 @@ -80,8 +82,20 @@
 class LocalFileSystem extends FileSystem {
 
   private static final Logger LOG = LoggerFactory.getLogger(LocalFileSystem.class);
+  private static final CopyOption[] DEFAULT_RENAME_COPY_OPTIONS =
+      new CopyOption[] {StandardCopyOption.ATOMIC_MOVE};
 
-  LocalFileSystem() {}
+  private final CopyOption[] renameCopyOption;
+
+  LocalFileSystem() {
+    renameCopyOption = DEFAULT_RENAME_COPY_OPTIONS;
+  }
+
+  // Exists to allow testing behaviour of the FileSystems utility methods
+  @VisibleForTesting
+  LocalFileSystem(CopyOption[] renameCopyOption) {
 
 Review comment:
   The original tests for FileSystem already test the behavior of 
LocalFileSystem with atomic move. The other FileSystem implementations like 
HDFS do not have that behavior though. 
   
   I considered that to test FileSystems.rename we could either:
   1) Use a LocalFileSystem configured in a way to behave like other FS 
implementations (of course you would not use it in that manner normally)
   2) Have a standard test suite for FileSystems but run it in the tests of 
each FileSystem implementation
   
   I opted for approach 1) here but I get the impression that is not 
particularly clear so perhaps 2) would be better after all? 
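
   For illustration only, approach 2) could look roughly like the sketch below. It is written in Python rather than Java and is not code from this PR: the mixin, the two toy filesystem classes, and their rename() signature are all hypothetical. The idea is that one shared set of contract tests runs once per filesystem implementation.

```python
import os
import shutil
import tempfile
import unittest


class RenameContractTests(object):
    """Shared contract tests; each subclass supplies make_fs()."""

    def make_fs(self):
        raise NotImplementedError

    def setUp(self):
        self.root = tempfile.mkdtemp()
        self.fs = self.make_fs()

    def tearDown(self):
        shutil.rmtree(self.root)

    def test_rename_moves_content(self):
        src = os.path.join(self.root, "src")
        dst = os.path.join(self.root, "dst")
        with open(src, "w") as f:
            f.write("content")
        self.fs.rename(src, dst)
        self.assertFalse(os.path.exists(src))
        with open(dst) as f:
            self.assertEqual(f.read(), "content")


class RenamingFs(object):
    """Hypothetical filesystem with a native rename."""

    def rename(self, src, dst):
        os.rename(src, dst)


class CopyDeleteFs(object):
    """Hypothetical filesystem that emulates rename by copy+delete."""

    def rename(self, src, dst):
        shutil.copyfile(src, dst)
        os.remove(src)


class RenamingFsTest(RenameContractTests, unittest.TestCase):
    def make_fs(self):
        return RenamingFs()


class CopyDeleteFsTest(RenameContractTests, unittest.TestCase):
    def make_fs(self):
        return CopyDeleteFs()


def run_contract_suite():
    """Runs the shared tests against both implementations."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (RenamingFsTest, CopyDeleteFsTest):
        suite.addTests(loader.loadTestsFromTestCase(case))
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, result.wasSuccessful()
```

   The trade-off is that each implementation's test module has to opt in to the shared suite, whereas approach 1) keeps everything in a single test class.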




Issue Time Tracking
---

Worklog Id: (was: 150607)
Time Spent: 8h 20m  (was: 8h 10m)






[jira] [Created] (BEAM-5626) Several IO tests fail in Python with RuntimeError('dictionary changed size during iteration',)}

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5626:
-

 Summary: Several IO tests fail in Python  with 
RuntimeError('dictionary changed size during iteration',)}
 Key: BEAM-5626
 URL: https://issues.apache.org/jira/browse/BEAM-5626
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


 ERROR: test_delete_dir (apache_beam.io.hadoopfilesystem_test.HadoopFileSystemTest)
--
Traceback (most recent call last):
  File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem_test.py", line 506, in test_delete_dir
    self.fs.delete([url_t1])
  File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem.py", line 370, in delete
    raise BeamIOError("Delete operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Delete operation failed with exceptions {'hdfs://test_dir/new_dir1': RuntimeError('dictionary changed size during iteration',)}






[beam] branch asf-site updated: Publishing website 2018/10/03 05:03:09 at commit 872a368

2018-10-02 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new ce9ab38  Publishing website 2018/10/03 05:03:09 at commit 872a368
ce9ab38 is described below

commit ce9ab383d82cb6cea9e37349bb2bc0fd07909561
Author: jenkins 
AuthorDate: Wed Oct 3 05:03:09 2018 +

Publishing website 2018/10/03 05:03:09 at commit 872a368



[jira] [Work logged] (BEAM-5619) Fix minor bug in JdbcIO example code

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5619?focusedWorklogId=150604&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150604
 ]

ASF GitHub Bot logged work on BEAM-5619:


Author: ASF GitHub Bot
Created on: 03/Oct/18 05:01
Start Date: 03/Oct/18 05:01
Worklog Time Spent: 10m 
  Work Description: jbonofre closed pull request #6553: [BEAM-5619] Fix 
minor bug in JdbcIO example code
URL: https://github.com/apache/beam/pull/6553
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java b/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
index 6cb5c932338..bbb4ded6aa6 100644
--- a/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
+++ b/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
@@ -136,8 +136,8 @@
  *      .withPreparedStatementSetter(new JdbcIO.PreparedStatementSetter<KV<Integer, String>>() {
  *        public void setParameters(KV<Integer, String> element, PreparedStatement query)
  *          throws SQLException {
- *          query.setInt(1, kv.getKey());
- *          query.setString(2, kv.getValue());
+ *          query.setInt(1, element.getKey());
+ *          query.setString(2, element.getValue());
  *        }
  *      })
  *    );


 




Issue Time Tracking
---

Worklog Id: (was: 150604)
Time Spent: 20m  (was: 10m)

> Fix minor bug in JdbcIO example code
> 
>
> Key: BEAM-5619
> URL: https://issues.apache.org/jira/browse/BEAM-5619
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Trivial
> Fix For: 2.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> There's a minor bug in the JdbcIO Javadoc:
> {code}
> 127  * {@code
> 128  * pipeline
> 129  *   .apply(...)
> 130  *   .apply(JdbcIO.<KV<Integer, String>>write()
> 131  *      .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
> 132  *            "com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb")
> 133  *          .withUsername("username")
> 134  *          .withPassword("password"))
> 135  *      .withStatement("insert into Person values(?, ?)")
> 136  *      .withPreparedStatementSetter(new JdbcIO.PreparedStatementSetter<KV<Integer, String>>() {
> 137  *        public void setParameters(KV<Integer, String> element, PreparedStatement query)
> 138  *          throws SQLException {
> 139  *          query.setInt(1, kv.getKey());
> 140  *          query.setString(2, kv.getValue());
> 141  *        }
> 142  *      })
> 143  *    );
> 144  * }
> {code}
> {{kv}} on lines 139 and 140 should be {{element}}.





[jira] [Resolved] (BEAM-5619) Fix minor bug in JdbcIO example code

2018-10-02 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/BEAM-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré resolved BEAM-5619.

   Resolution: Fixed
Fix Version/s: 2.8.0






[beam] 01/01: Merge pull request #6553 from sekikn/BEAM-5619

2018-10-02 Thread jbonofre
This is an automated email from the ASF dual-hosted git repository.

jbonofre pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 872a3686ecce503ab87af100b515024917238d8d
Merge: c48f429 31dbc14
Author: Jean-Baptiste Onofré 
AuthorDate: Wed Oct 3 07:01:19 2018 +0200

Merge pull request #6553 from sekikn/BEAM-5619

[BEAM-5619] Fix minor bug in JdbcIO example code

 .../io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)



[beam] branch master updated (c48f429 -> 872a368)

2018-10-02 Thread jbonofre
This is an automated email from the ASF dual-hosted git repository.

jbonofre pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from c48f429  Merge pull request #6528: [BEAM-5534] TopWikipediaSessionsIT
 add 31dbc14  [BEAM-5619] Fix minor bug in JdbcIO example code
 new 872a368  Merge pull request #6553 from sekikn/BEAM-5619

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)



[jira] [Work logged] (BEAM-5625) PortableRunner.PipelineResult.cancel not working

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5625?focusedWorklogId=150598&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150598
 ]

ASF GitHub Bot logged work on BEAM-5625:


Author: ASF GitHub Bot
Created on: 03/Oct/18 04:25
Start Date: 03/Oct/18 04:25
Worklog Time Spent: 10m 
  Work Description: jcruelty commented on issue #6554: [BEAM-5625] Add 
missing CancelJobRequest parameter
URL: https://github.com/apache/beam/pull/6554#issuecomment-426507664
 
 
   +1




Issue Time Tracking
---

Worklog Id: (was: 150598)
Time Spent: 0.5h  (was: 20m)

> PortableRunner.PipelineResult.cancel not working
> 
>
> Key: BEAM-5625
> URL: https://issues.apache.org/jira/browse/BEAM-5625
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Fails due to missing CancelJobRequest parameter:
>     self._job_service.Cancel()
> TypeError: __call__() takes at least 2 arguments (1 given)
>  





[jira] [Work logged] (BEAM-5625) PortableRunner.PipelineResult.cancel not working

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5625?focusedWorklogId=150597&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150597
 ]

ASF GitHub Bot logged work on BEAM-5625:


Author: ASF GitHub Bot
Created on: 03/Oct/18 04:23
Start Date: 03/Oct/18 04:23
Worklog Time Spent: 10m 
  Work Description: rakeshcusat commented on issue #6554: [BEAM-5625] Add 
missing CancelJobRequest parameter
URL: https://github.com/apache/beam/pull/6554#issuecomment-426507374
 
 
    




Issue Time Tracking
---

Worklog Id: (was: 150597)
Time Spent: 20m  (was: 10m)






[jira] [Work logged] (BEAM-5625) PortableRunner.PipelineResult.cancel not working

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5625?focusedWorklogId=150595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150595
 ]

ASF GitHub Bot logged work on BEAM-5625:


Author: ASF GitHub Bot
Created on: 03/Oct/18 04:17
Start Date: 03/Oct/18 04:17
Worklog Time Spent: 10m 
  Work Description: tweise opened a new pull request #6554: [BEAM-5625] Add 
missing CancelJobRequest parameter
URL: https://github.com/apache/beam/pull/6554
 
 
   Add missing CancelJobRequest parameter for job_service.Cancel call.
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 150595)
Time Spent: 10m
Remaining Estimate: 0h


[jira] [Created] (BEAM-5625) PortableRunner.PipelineResult.cancel not working

2018-10-02 Thread Thomas Weise (JIRA)
Thomas Weise created BEAM-5625:
--

 Summary: PortableRunner.PipelineResult.cancel not working
 Key: BEAM-5625
 URL: https://issues.apache.org/jira/browse/BEAM-5625
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: Thomas Weise
Assignee: Thomas Weise
 Fix For: 2.8.0


Fails due to missing CancelJobRequest parameter:

    self._job_service.Cancel()

TypeError: __call__() takes at least 2 arguments (1 given)
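
A minimal sketch of the failure mode, assuming the RPC method is exposed as a callable object whose __call__ requires a request argument. Both classes below are hypothetical stand-ins written for this illustration (including the job_id field); they are not the generated Beam job API or gRPC classes.

```python
class CancelJobRequest(object):
    """Hypothetical stand-in for the generated request message."""

    def __init__(self, job_id):
        self.job_id = job_id


class FakeCancelCallable(object):
    """Hypothetical stand-in for an RPC callable that requires a request."""

    def __call__(self, request, timeout=None):
        return "cancelled %s" % request.job_id


def demo():
    cancel = FakeCancelCallable()
    try:
        cancel()  # Mirrors the failing call: no request object is passed.
        error = None
    except TypeError as exc:
        error = type(exc).__name__
    # Passing a request object satisfies the required positional argument.
    ok = cancel(CancelJobRequest(job_id="job-1"))
    return error, ok
```

The exact wording of the TypeError message differs between Python 2 and Python 3, so the sketch only records the exception type.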

 





[jira] [Created] (BEAM-5624) Avro IO does not work with avro-python3 package out-of-the-box on Python 3, several tests fail with AttributeError (module 'avro.schema' has no attribute 'parse')

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5624:
-

 Summary: Avro IO does not work with avro-python3 package 
out-of-the-box on Python 3, several tests fail with AttributeError (module 
'avro.schema' has no attribute 'parse') 
 Key: BEAM-5624
 URL: https://issues.apache.org/jira/browse/BEAM-5624
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


==
ERROR: Failure: AttributeError (module 'avro.schema' has no attribute 'parse')
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/failure.py",
 line 39, in runTest
raise self.exc_val.with_traceback(self.tb)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/loader.py",
 line 418, in loadTestsFromName
addr.filename, addr.module)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/importer.py",
 line 47, in importFromPath
return self.importFromDir(dir_path, fqname)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/nose/importer.py",
 line 94, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/imp.py",
 line 234, in load_module
return load_source(name, filename, file)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/imp.py",
 line 172, in load_source
module = _load(spec)
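  File "<frozen importlib._bootstrap>", line 693, in _load
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 673, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed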
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/avroio_test.py",
 line 54, in <module>
class TestAvro(unittest.TestCase):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/avroio_test.py",
 line 89, in TestAvro
SCHEMA = avro.schema.parse('''
AttributeError: module 'avro.schema' has no attribute 'parse'

Note that we use a different implementation of avro/avro-python3 package 
depending on Python version. We are also evaluating potential replacement of 
avro with fastavro.
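
A minimal sketch of how calling code could tolerate both packages, assuming (as the traceback above suggests) that the Python 2 avro package exposes avro.schema.parse while avro-python3 exposes the capitalized avro.schema.Parse. The helper name get_schema_parser is hypothetical, and the stand-in modules only simulate the two packages so the snippet runs without either installed; it is not the fix adopted in Beam.

```python
import json
import types


def get_schema_parser(schema_module):
    """Return whichever schema-parsing function the given module provides."""
    # Assumption: avro (Python 2) names it `parse`, avro-python3 names it `Parse`.
    for name in ("parse", "Parse"):
        parser = getattr(schema_module, name, None)
        if callable(parser):
            return parser
    raise AttributeError("module has neither 'parse' nor 'Parse'")


# Stand-ins for the two packages; real code would pass `avro.schema` instead.
fake_avro_py2 = types.SimpleNamespace(parse=json.loads)
fake_avro_py3 = types.SimpleNamespace(Parse=json.loads)

SCHEMA_JSON = '{"type": "record", "name": "Example", "fields": []}'
schema_a = get_schema_parser(fake_avro_py2)(SCHEMA_JSON)
schema_b = get_schema_parser(fake_avro_py3)(SCHEMA_JSON)
```

Looking the function up once at import time keeps the rest of a test module free of version checks.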





[jira] [Created] (BEAM-5623) Several tests IO tests hang indefinitely during execution on Python 3.

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5623:
-

 Summary: Several tests IO tests hang indefinitely during execution 
on Python 3.
 Key: BEAM-5623
 URL: https://issues.apache.org/jira/browse/BEAM-5623
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


test_read_empty_single_file_no_eol_gzip 
(apache_beam.io.textio_test.TextSourceTest) 

Also several test cases in tfrecordio_test, for example:

test_process_auto (apache_beam.io.tfrecordio_test.TestReadAllFromTFRecord)





[jira] [Commented] (BEAM-5618) Several tests fail on Python 3 with: unsupported operand type(s) for +: 'int' and 'EmptySideInput'

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636376#comment-16636376
 ] 

Valentyn Tymofieiev commented on BEAM-5618:
---

ERROR: test_as_singleton_without_unique_labels 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 677, in process
self.do_fn_invoker.invoke_process(windowed_value)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 414, in invoke_process
windowed_value, self.process_method(windowed_value.value))
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
 line 1068, in <lambda>
wrapper = lambda x: [fn(x)]
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/sideinputs_test.py",
 line 224, in match
equal_to([expected_singleton])([actual_singleton1])
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
 line 119, in _equal
'Failed assert: %r == %r' % (sorted_expected, sorted_actual))
apache_beam.testing.util.BeamAssertException: Failed assert: [2] == 
[]



> Several tests fail on Python 3 with: unsupported operand type(s) for +: 'int' 
> and 'EmptySideInput'
> --
>
> Key: BEAM-5618
> URL: https://issues.apache.org/jira/browse/BEAM-5618
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> ERROR: test_do_with_side_input_as_arg 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 529, in invoke_process
> windowed_value, additional_args, additional_kwargs, output_processor)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 598, in _invoke_per_window
> windowed_value, self.process_method(*args_for_process))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform_test.py",
>  line 135, in <lambda>
> lambda x, addon: [x + addon], pvalue.AsSingleton(side))
> TypeError: unsupported operand type(s) for +: 'int' and 'EmptySideInput'





[jira] [Commented] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636375#comment-16636375
 ] 

Valentyn Tymofieiev commented on BEAM-5621:
---

A few other examples:

==
ERROR: test_target_duration (apache_beam.transforms.util_test.BatchElementsTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/util_test.py",
 line 103, in test_target_duration
target_batch_overhead=None, target_batch_duration_secs=10, clock=clock)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/util.py",
 line 226, in __init__
if max(0, target_batch_overhead, target_batch_duration_secs) == 0:
TypeError: unorderable types: NoneType() > int()

==
ERROR: test_target_overhead (apache_beam.transforms.util_test.BatchElementsTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/util_test.py",
 line 117, in test_target_overhead
target_batch_overhead=.05, target_batch_duration_secs=None, clock=clock)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/util.py",
 line 226, in __init__
if max(0, target_batch_overhead, target_batch_duration_secs) == 0:
TypeError: unorderable types: NoneType() > float()
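
The two failures above come from max() being handed None alongside numbers. Python 2 ordered None below every number, while Python 3 refuses the comparison. A minimal sketch of a None-tolerant variant follows; the helper names largest_target and py3_max_rejects_none are hypothetical and this is not necessarily how util.py was changed.

```python
def largest_target(*targets):
    """Return the largest non-None target, or 0 if none is set."""
    # Dropping None first avoids comparing NoneType with numbers,
    # which Python 3 rejects with a TypeError.
    return max([0] + [t for t in targets if t is not None])


def py3_max_rejects_none():
    """Report whether max() over ints and None raises TypeError."""
    try:
        max(0, None, 10)
    except TypeError:
        return True
    return False
```

With this helper, largest_target(None, 10) gives 10 and largest_target(None, None) gives 0, matching what the Python 2 expression max(0, a, b) produced for the same inputs.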

==
ERROR: test_reshuffle_window_fn_preserved 
(apache_beam.transforms.util_test.ReshuffleTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 677, in process
self.do_fn_invoker.invoke_process(windowed_value)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 414, in invoke_process
windowed_value, self.process_method(windowed_value.value))
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
 line 1068, in <lambda>
wrapper = lambda x: [fn(x)]
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
 line 115, in _equal
sorted_expected = sorted(expected)
TypeError: unorderable types: list() < InAnyOrder()
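
For the str() < int() variant named in the issue title, one possible workaround is to sort with a key that never compares values of different types directly. This is a sketch under that assumption only: mixed_sort_key is a hypothetical name, it is not the change made to apache_beam.testing.util, and it does not address nested containers or custom matcher objects such as InAnyOrder.

```python
def mixed_sort_key(value):
    """Order values by type name first, then by the value itself."""
    # Tuples compare element by element, so values of different types
    # are separated by their type name and never compared to each other.
    return (type(value).__name__, value)


expected = [3, "a", 1, "b"]
sorted_expected = sorted(expected, key=mixed_sort_key)
# sorted_expected is [1, 3, 'a', 'b']
```

The resulting order differs from Python 2's, but it is deterministic, which is all an order-insensitive equality check needs.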


> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in <lambda>
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()





[jira] [Created] (BEAM-5622) Several tests fail on Python 3 with: Runtime type violation detected

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5622:
-

 Summary: Several tests fail on Python 3 with: Runtime type 
violation detected
 Key: BEAM-5622
 URL: https://issues.apache.org/jira/browse/BEAM-5622
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


==
FAIL: test_combine_runtime_type_check_violation_using_decorators 
(apache_beam.transforms.ptransform_test.PTransformTypeCheckTestCase)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform_test.py",
 line 1543, in test_combine_runtime_type_check_violation_using_decorators
"Runtime type violation detected within "
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform_test.py",
 line 911, in assertStartswith
'"%s" does not start with "%s"' % (msg, prefix))
AssertionError: False is not true : "Runtime type violation detected within 
Mul/CombinePerKey: Type-hint for return type violated. Expected an instance of 
, instead found 
25252525252525252525252525252525252525252525252525, an instance of ." does not start with "Runtime type violation detected within 
Mul/CombinePerKey: Type-hint for return type violated. Expected an instance of 
, instead found"

==
FAIL: test_combine_runtime_type_check_violation_using_methods 
(apache_beam.transforms.ptransform_test.PTransformTypeCheckTestCase)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform_test.py",
 line 1597, in test_combine_runtime_type_check_violation_using_methods
"Runtime type violation detected within "
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform_test.py",
 line 911, in assertStartswith
'"%s" does not start with "%s"' % (msg, prefix))
AssertionError: False is not true : "Runtime type violation detected within 
ParDo(SortJoin/KeyWithVoid): Type-hint for argument: 'v' violated. Expected an 
instance of , instead found 0, an instance of . 
[while running 'SortJoin/KeyWithVoid']" does not start with "Runtime type 
violation detected within ParDo(SortJoin/KeyWithVoid): Type-hint for argument: 
'v' violated. Expected an instance of , instead found 0, an 
instance of ."







[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150593=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150593
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 03:02
Start Date: 03/Oct/18 03:02
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #6289: [BEAM-5036] 
Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#issuecomment-426496924
 
 
   We have several performance tests for file-based IO that are run regularly 
through Jenkins for various file-systems: 
https://builds.apache.org/view/A-D/view/Beam/
   
   So you can probably try some of those tests.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150593)
Time Spent: 8h 10m  (was: 8h)

> Optimize FileBasedSink's WriteOperation.moveToOutput()
> --
>
> Key: BEAM-5036
> URL: https://issues.apache.org/jira/browse/BEAM-5036
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-files
>Affects Versions: 2.5.0
>Reporter: Jozef Vilcek
>Assignee: Tim Robertson
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> The moveToOutput() methods in FileBasedSink.WriteOperation implement a move as 
> copy+delete. It would be better to use rename(), which can be much more 
> efficient on some filesystems.
> The filesystem must support cross-directory rename. BEAM-4861 is related to this 
> for the case of the HDFS filesystem.
> Feature was discussed here:
> http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E
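
To illustrate the difference the issue describes, here is a minimal local-filesystem sketch in Python (the actual change is in the Java SDK). The function names and the demo layout are hypothetical; the only library calls used are shutil.copyfile, os.remove and os.replace from the standard library.

```python
import os
import shutil
import tempfile


def move_by_copy_delete(src, dst):
    """Move a file by copying its bytes and then removing the source."""
    shutil.copyfile(src, dst)
    os.remove(src)


def move_by_rename(src, dst):
    """Move a file with a single rename; os.replace overwrites dst if present."""
    os.replace(src, dst)


def demo():
    """Move one temp file per strategy across directories and report results."""
    with tempfile.TemporaryDirectory() as root:
        out_dir = os.path.join(root, "output")
        os.mkdir(out_dir)
        results = []
        for i, move in enumerate((move_by_copy_delete, move_by_rename)):
            src = os.path.join(root, "temp-%d" % i)
            dst = os.path.join(out_dir, "final-%d" % i)
            with open(src, "w") as f:
                f.write("shard %d" % i)
            move(src, dst)
            with open(dst) as f:
                results.append((os.path.exists(src), f.read()))
        return results
```

Both strategies leave the same end state. The rename avoids rewriting the file contents, which is where the saving would come from on filesystems that support cross-directory rename.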





[jira] [Comment Edited] (BEAM-5615) Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636370#comment-16636370
 ] 

Valentyn Tymofieiev edited comment on BEAM-5615 at 10/3/18 3:01 AM:


Possibly related: https://issues.apache.org/jira/browse/BEAM-5620


was (Author: tvalentyn):
Possibly related: https://issues.apache.org/jira/browse/BEAM-5615

> Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword 
> argument for this function
> -
>
> Key: BEAM-5615
> URL: https://issues.apache.org/jira/browse/BEAM-5615
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
>  line 89, in test_top
> names)  # Note parameter passed to comparator.
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 467, in apply
> label or transform.label)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 477, in apply
> return self.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 759, in expand
> return self._fn(pcoll, *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 185, in Of
> TopCombineFn(n, compare, key, reverse), *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1251, in expand
> default_value = combine_fn.apply([], *self.args, **self.kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 623, in apply
> *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 362, in extract_output
> self._sort_buffer(buffer, lt)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 295, in _sort_buffer
> key=self._key_fn)
> TypeError: 'cmp' is an invalid keyword argument for this function
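
Python 3 removed the cmp keyword from sorted() and list.sort(). A common migration, sketched here with a hypothetical comparator and sample data rather than the actual TopCombineFn code, is to wrap the comparator with functools.cmp_to_key:

```python
import functools


def compare_lengths(a, b):
    """Python 2 style comparator: negative, zero, or positive."""
    return len(a) - len(b)


names = ["Charlie", "Al", "Barb"]
# sorted(names, cmp=compare_lengths) raises TypeError on Python 3.
by_length = sorted(names, key=functools.cmp_to_key(compare_lengths))
# by_length is ['Al', 'Barb', 'Charlie']
```

Code that passed both cmp and key in Python 2 would need to fold the key function into the comparator, since only key remains.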





[jira] [Commented] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636371#comment-16636371
 ] 

Valentyn Tymofieiev commented on BEAM-5621:
---

Possibly related: https://issues.apache.org/jira/browse/BEAM-5615

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in <lambda>
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()





[jira] [Commented] (BEAM-5615) Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636370#comment-16636370
 ] 

Valentyn Tymofieiev commented on BEAM-5615:
---

Possibly related: https://issues.apache.org/jira/browse/BEAM-5615

> Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword 
> argument for this function
> -
>
> Key: BEAM-5615
> URL: https://issues.apache.org/jira/browse/BEAM-5615
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
>  line 89, in test_top
> names)  # Note parameter passed to comparator.
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 467, in apply
> label or transform.label)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 477, in apply
> return self.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 759, in expand
> return self._fn(pcoll, *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 185, in Of
> TopCombineFn(n, compare, key, reverse), *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1251, in expand
> default_value = combine_fn.apply([], *self.args, **self.kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 623, in apply
> *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 362, in extract_output
> self._sort_buffer(buffer, lt)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 295, in _sort_buffer
> key=self._key_fn)
> TypeError: 'cmp' is an invalid keyword argument for this function





[jira] [Comment Edited] (BEAM-5615) Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636370#comment-16636370
 ] 

Valentyn Tymofieiev edited comment on BEAM-5615 at 10/3/18 3:01 AM:


Possibly related: https://issues.apache.org/jira/browse/BEAM-5621


was (Author: tvalentyn):
Possibly related: https://issues.apache.org/jira/browse/BEAM-5620

> Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword 
> argument for this function
> -
>
> Key: BEAM-5615
> URL: https://issues.apache.org/jira/browse/BEAM-5615
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
>  line 89, in test_top
> names)  # Note parameter passed to comparator.
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 467, in apply
> label or transform.label)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 477, in apply
> return self.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 759, in expand
> return self._fn(pcoll, *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 185, in Of
> TopCombineFn(n, compare, key, reverse), *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1251, in expand
> default_value = combine_fn.apply([], *self.args, **self.kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 623, in apply
> *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 362, in extract_output
> self._sort_buffer(buffer, lt)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 295, in _sort_buffer
> key=self._key_fn)
> TypeError: 'cmp' is an invalid keyword argument for this function





[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150592=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150592
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 03:00
Start Date: 03/Oct/18 03:00
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#6289: [BEAM-5036] Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#discussion_r222169404
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileSystemsTest.java
 ##
 @@ -186,6 +189,56 @@ public void testRenameIgnoreMissingFiles() throws 
Exception {
 containsInAnyOrder("content3"));
   }
 
+  @Test
+  public void testRenameOverwriteExistingFilesThrows() throws Exception {
+// Simulate a filesystem which will not overwrite existing files
+FileSystem fs = new LocalFileSystem(new CopyOption[0]);
+thrown.expect(FileAlreadyExistsException.class);
+executeRename(fs, false);
+  }
+
+  @Test
+  public void testRenameOverwriteExistingFiles() throws Exception {
+// Simulate a filesystem which will not overwrite existing files
 
 Review comment:
   Why do we want tests to be different from production?




Issue Time Tracking
---

Worklog Id: (was: 150592)
Time Spent: 8h  (was: 7h 50m)

> Optimize FileBasedSink's WriteOperation.moveToOutput()
> --
>
> Key: BEAM-5036
> URL: https://issues.apache.org/jira/browse/BEAM-5036
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-files
>Affects Versions: 2.5.0
>Reporter: Jozef Vilcek
>Assignee: Tim Robertson
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> The moveToOutput() methods in FileBasedSink.WriteOperation implement a move as 
> copy+delete. It would be better to use rename(), which can be much more 
> efficient on some filesystems.
> The filesystem must support cross-directory rename. BEAM-4861 is related to this 
> for the case of the HDFS filesystem.
> Feature was discussed here:
> http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E





[jira] [Work logged] (BEAM-5036) Optimize FileBasedSink's WriteOperation.moveToOutput()

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5036?focusedWorklogId=150591=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150591
 ]

ASF GitHub Bot logged work on BEAM-5036:


Author: ASF GitHub Bot
Created on: 03/Oct/18 03:00
Start Date: 03/Oct/18 03:00
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#6289: [BEAM-5036] Optimize the FileBasedSink WriteOperation.moveToOutput()
URL: https://github.com/apache/beam/pull/6289#discussion_r222168964
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java
 ##
 @@ -80,8 +82,20 @@
 class LocalFileSystem extends FileSystem {
 
   private static final Logger LOG = 
LoggerFactory.getLogger(LocalFileSystem.class);
+  private static final CopyOption[] DEFAULT_RENAME_COPY_OPTIONS =
+  new CopyOption[] {StandardCopyOption.ATOMIC_MOVE};
 
 Review comment:
   Sorry, what I meant was just to do Files.move(src.getPath(), dst.getPath(), 
ImmutableList.of(renameCopyOption)) unless there's a reason to maintain a list 
here.




Issue Time Tracking
---

Worklog Id: (was: 150591)

> Optimize FileBasedSink's WriteOperation.moveToOutput()
> --
>
> Key: BEAM-5036
> URL: https://issues.apache.org/jira/browse/BEAM-5036
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-files
>Affects Versions: 2.5.0
>Reporter: Jozef Vilcek
>Assignee: Tim Robertson
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> The moveToOutput() methods in FileBasedSink.WriteOperation implement a move as 
> copy+delete. It would be better to use rename(), which can be much more 
> efficient on some filesystems.
> The filesystem must support cross-directory rename. BEAM-4861 is related to this 
> for the case of the HDFS filesystem.
> Feature was discussed here:
> http://mail-archives.apache.org/mod_mbox/beam-dev/201807.mbox/%3CCAF9t7_4Mp54pQ+vRrJrBh9Vx0=uaknupzd_qdh_qdm9vxll...@mail.gmail.com%3E
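
For illustration only, the "rename instead of copy+delete" idea described above 
can be sketched as follows. This is not the code from the pull request (which is 
Java); it is a minimal Python analogue, and the names move_file, work_dir, 
temp_path and final_path are hypothetical. It assumes a local filesystem where a 
rename across devices fails with EXDEV.

```python
import errno
import os
import shutil
import tempfile


def move_file(src, dst):
    """Move src to dst, preferring a rename over copy+delete."""
    try:
        # A rename is a metadata-only operation on many filesystems.
        os.rename(src, dst)
    except OSError as exc:
        # EXDEV means src and dst are on different devices, where a
        # rename is not possible; fall back to copy followed by delete.
        if exc.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)
        os.remove(src)


# Small demonstration in a scratch directory (cross-directory move).
work_dir = tempfile.mkdtemp()
out_dir = os.path.join(work_dir, 'output')
os.mkdir(out_dir)
temp_path = os.path.join(work_dir, 'shard-0.tmp')
final_path = os.path.join(out_dir, 'shard-0.txt')
with open(temp_path, 'w') as f:
    f.write('records')

move_file(temp_path, final_path)
print(os.path.exists(temp_path), os.path.exists(final_path))
```

The fallback keeps the old copy+delete behaviour for the cases where a rename is 
not supported, so only the common case gets cheaper.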



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5621:
-

 Summary: Several tests fail on Python 3 with TypeError: 
unorderable types: str() < int()
 Key: BEAM-5621
 URL: https://issues.apache.org/jira/browse/BEAM-5621
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


==
ERROR: test_remove_duplicates 
(apache_beam.transforms.ptransform_test.PTransformTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 677, in process
self.do_fn_invoker.invoke_process(windowed_value)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 414, in invoke_process
windowed_value, self.process_method(windowed_value.value))
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
 line 1068, in 
wrapper = lambda x: [fn(x)]
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
 line 115, in _equal
sorted_expected = sorted(expected)
TypeError: unorderable types: str() < int()
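
The error can be reproduced outside Beam. Python 2 allowed ordering comparisons 
between str and int, while Python 3 raises TypeError. The sketch below is not 
the fix applied in apache_beam/testing/util.py; sort_mixed is a hypothetical 
helper that only shows one way to get a deterministic order for mixed-type 
values, assuming values of the same type are orderable among themselves.

```python
def sort_mixed(values):
    """Sort mixed-type values by (type name, value).

    Grouping by type name first means Python 3 never has to compare
    a str with an int directly.
    """
    return sorted(values, key=lambda v: (type(v).__name__, v))


try:
    sorted(['a', 1])
except TypeError as exc:
    # Python 3 refuses to order str against int.
    print('sorted() failed:', exc)

print(sort_mixed(['b', 2, 'a', 1]))
```

The resulting order is deterministic but arbitrary across types, which is enough 
when the sorted lists are only compared for equality.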






[jira] [Assigned] (BEAM-5620) Some tests use assertItemsEqual method, not available in Python 3

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-5620:
-

Assignee: (was: Ahmet Altay)

> Some tests use assertItemsEqual method, not available in Python 3
> -
>
> Key: BEAM-5620
> URL: https://issues.apache.org/jira/browse/BEAM-5620
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> See: 
> https://github.com/apache/beam/search?q=assertItemsEqual_q=assertItemsEqual





[jira] [Assigned] (BEAM-5612) Add tox suites to exercise unit tests using Python3 interpreter with cython, and with gcp dependencies.

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-5612:
-

Assignee: (was: Ahmet Altay)

> Add tox suites to exercise unit tests using Python3 interpreter with cython, 
> and with gcp dependencies.
> ---
>
> Key: BEAM-5612
> URL: https://issues.apache.org/jira/browse/BEAM-5612
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>






[jira] [Created] (BEAM-5620) Some tests use assertItemsEqual method, not available in Python 3

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5620:
-

 Summary: Some tests use assertItemsEqual method, not available in 
Python 3
 Key: BEAM-5620
 URL: https://issues.apache.org/jira/browse/BEAM-5620
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: Ahmet Altay


See: 
https://github.com/apache/beam/search?q=assertItemsEqual_q=assertItemsEqual
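
For context: unittest's assertItemsEqual from Python 2 is available in Python 3 
under the name assertCountEqual. A minimal sketch of a test that works on both 
versions is shown below; ItemsEqualCompatTest and assert_same_items are 
hypothetical names, and this is not necessarily how the Beam tests were changed.

```python
import unittest


class ItemsEqualCompatTest(unittest.TestCase):

    def assert_same_items(self, first, second):
        # Python 3 provides assertCountEqual; Python 2 provides
        # assertItemsEqual. Use whichever one exists.
        check = (getattr(self, 'assertCountEqual', None)
                 or getattr(self, 'assertItemsEqual'))
        check(first, second)

    def test_order_is_ignored(self):
        # Same elements with the same multiplicities, different order.
        self.assert_same_items([3, 1, 2, 2], [2, 3, 2, 1])
```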





[jira] [Work logged] (BEAM-5619) Fix minor bug in JdbcIO example code

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5619?focusedWorklogId=150588=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150588
 ]

ASF GitHub Bot logged work on BEAM-5619:


Author: ASF GitHub Bot
Created on: 03/Oct/18 02:51
Start Date: 03/Oct/18 02:51
Worklog Time Spent: 10m 
  Work Description: sekikn opened a new pull request #6553: [BEAM-5619] Fix 
minor bug in JdbcIO example code
URL: https://github.com/apache/beam/pull/6553
 
 
   **Please** add a meaningful description for your change here
   
   Example code for writing to JDBC datasource in JdbcIO Javadoc doesn't work 
as it is now. This PR fixes it.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 150588)
Time Spent: 10m
Remaining Estimate: 0h

> Fix minor bug in JdbcIO example code
> 
>
> Key: BEAM-5619
> URL: https://issues.apache.org/jira/browse/BEAM-5619
> Project: Beam
> 

[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=150587=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150587
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 03/Oct/18 02:49
Start Date: 03/Oct/18 02:49
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #554: [BEAM-5339] update 
the beam dependency guide
URL: https://github.com/apache/beam-site/pull/554#issuecomment-426495115
 
 
   @asfgit merge




Issue Time Tracking
---

Worklog Id: (was: 150587)
Time Spent: 7h 10m  (was: 7h)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner but rather we will CC all the folks that mentioned that 
> they will be interested in receiving updates related to that dependency. The 
> hope is that some of the interested parties will also put forward the effort 
> to upgrade dependencies they are interested in, but the responsibility for 
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for 
> upgrading to specific versions of those dependencies. For example, if a given 
> dependency X is three minor versions or a year behind we will create a JIRA 
> for upgrading that. But the specific version to upgrade to has to be 
> determined by the Beam community. The Beam community might choose to close a 
> JIRA if there are known issues with available recent releases. The tool may 
> reopen such a closed JIRA in the future if new information becomes available 
> (for example, 3 new versions have been released since the JIRA was closed).





[jira] [Created] (BEAM-5619) Fix minor bug in JdbcIO example code

2018-10-02 Thread Kengo Seki (JIRA)
Kengo Seki created BEAM-5619:


 Summary: Fix minor bug in JdbcIO example code
 Key: BEAM-5619
 URL: https://issues.apache.org/jira/browse/BEAM-5619
 Project: Beam
  Issue Type: Bug
  Components: io-java-jdbc
Reporter: Kengo Seki
Assignee: Kengo Seki


There's a minor bug in JdbcIO Javadoc:

{code}
127  * {@code
128  * pipeline
129  *   .apply(...)
130  *   .apply(JdbcIO.<KV<Integer, String>>write()
131  *  .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
132  *"com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb")
133  *  .withUsername("username")
134  *  .withPassword("password"))
135  *  .withStatement("insert into Person values(?, ?)")
136  *  .withPreparedStatementSetter(new 
JdbcIO.PreparedStatementSetter<KV<Integer, String>>() {
137  *public void setParameters(KV<Integer, String> element, 
PreparedStatement query)
138  *  throws SQLException {
139  *  query.setInt(1, kv.getKey());
140  *  query.setString(2, kv.getValue());
141  *}
142  *  })
143  *);
144  * }
{code}

{{kv}} on lines 139 and 140 should be {{element}}.





[jira] [Created] (BEAM-5618) Several tests fail on Python 3 with: unsupported operand type(s) for +: 'int' and 'EmptySideInput'

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5618:
-

 Summary: Several tests fail on Python 3 with: unsupported operand 
type(s) for +: 'int' and 'EmptySideInput'
 Key: BEAM-5618
 URL: https://issues.apache.org/jira/browse/BEAM-5618
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


ERROR: test_do_with_side_input_as_arg 
(apache_beam.transforms.ptransform_test.PTransformTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 677, in process
self.do_fn_invoker.invoke_process(windowed_value)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 529, in invoke_process
windowed_value, additional_args, additional_kwargs, output_processor)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 598, in _invoke_per_window
windowed_value, self.process_method(*args_for_process))
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform_test.py",
 line 135, in 
lambda x, addon: [x + addon], pvalue.AsSingleton(side))
TypeError: unsupported operand type(s) for +: 'int' and 'EmptySideInput'







[jira] [Updated] (BEAM-5617) Several SideInput tests fail on Python 3 with failed assert: == []

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-5617:
--
Summary: Several SideInput tests fail on Python 3 with failed assert:  == []  (was: Several SideInput tests fail on Python 3 with 
failed assert:  == [])

> Several SideInput tests fail on Python 3 with failed assert:  list> == []
> 
>
> Key: BEAM-5617
> URL: https://issues.apache.org/jira/browse/BEAM-5617
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> ==
> ERROR: test_iterable_side_input 
> (apache_beam.transforms.sideinputs_test.SideInputsTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in 
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 119, in _equal
> 'Failed assert: %r == %r' % (sorted_expected, sorted_actual))
> apache_beam.testing.util.BeamAssertException: Failed assert: [3, 4, 6, 8] == 
> []





[jira] [Created] (BEAM-5617) Several SideInput tests fail on Python 3 with failed assert: == []

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5617:
-

 Summary: Several SideInput tests fail on Python 3 with failed 
assert:  == []
 Key: BEAM-5617
 URL: https://issues.apache.org/jira/browse/BEAM-5617
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


==
ERROR: test_iterable_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 677, in process
self.do_fn_invoker.invoke_process(windowed_value)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 414, in invoke_process
windowed_value, self.process_method(windowed_value.value))
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
 line 1068, in 
wrapper = lambda x: [fn(x)]
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
 line 119, in _equal
'Failed assert: %r == %r' % (sorted_expected, sorted_actual))
apache_beam.testing.util.BeamAssertException: Failed assert: [3, 4, 6, 8] == []






[jira] [Created] (BEAM-5616) Several tests fail on Python 3 with Failed assert: [] == [nan]

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5616:
-

 Summary: Several tests fail on Python 3 with  Failed assert: 
[] == [nan]
 Key: BEAM-5616
 URL: https://issues.apache.org/jira/browse/BEAM-5616
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev


==
ERROR: test_global_fanout (apache_beam.transforms.combiners_test.CombineTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 677, in process
self.do_fn_invoker.invoke_process(windowed_value)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
 line 414, in invoke_process
windowed_value, self.process_method(windowed_value.value))
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
 line 1068, in 
wrapper = lambda x: [fn(x)]
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
 line 119, in _equal
'Failed assert: %r == %r' % (sorted_expected, sorted_actual))
apache_beam.testing.util.BeamAssertException: Failed assert: [49.5] == [nan]







[jira] [Created] (BEAM-5615) Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5615:
-

 Summary: Several tests fail on Python 3 with TypeError: 'cmp' is 
an invalid keyword argument for this function
 Key: BEAM-5615
 URL: https://issues.apache.org/jira/browse/BEAM-5615
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-harness
Reporter: Valentyn Tymofieiev


ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
--
Traceback (most recent call last):
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
 line 89, in test_top
names)  # Note parameter passed to comparator.
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
 line 111, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
 line 467, in apply
label or transform.label)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
 line 477, in apply
return self.apply(transform, pvalueish)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
 line 513, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
 line 193, in apply
return m(transform, input)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
 line 199, in apply_PTransform
return transform.expand(input)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
 line 759, in expand
return self._fn(pcoll, *args, **kwargs)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
 line 185, in Of
TopCombineFn(n, compare, key, reverse), *args, **kwargs)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
 line 111, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
 line 513, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
 line 193, in apply
return m(transform, input)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
 line 199, in apply_PTransform
return transform.expand(input)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
 line 1251, in expand
default_value = combine_fn.apply([], *self.args, **self.kwargs)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
 line 623, in apply
*args, **kwargs)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
 line 362, in extract_output
self._sort_buffer(buffer, lt)
  File 
"/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
 line 295, in _sort_buffer
key=self._key_fn)
TypeError: 'cmp' is an invalid keyword argument for this function
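
Background for this failure: in Python 3, list.sort() and sorted() no longer 
accept a cmp argument; only key (and reverse) remain. A Python 2 style 
comparator can be adapted with functools.cmp_to_key. The sketch below is a 
generic illustration, not the change made to combiners.py; the comparator 
by_length_then_alpha and the sample data are hypothetical.

```python
import functools


def by_length_then_alpha(a, b):
    """Python 2 style comparator: negative, zero or positive result."""
    if len(a) != len(b):
        return len(a) - len(b)
    # Equivalent of the removed builtin cmp(a, b).
    return (a > b) - (a < b)


names = ['pear', 'fig', 'apple', 'kiwi']

# Python 2 only: names.sort(cmp=by_length_then_alpha)
# Works on Python 2.7 and Python 3:
names.sort(key=functools.cmp_to_key(by_length_then_alpha))
print(names)
```

When a key function is also in use, as in the traceback above, one option is to 
apply the key inside the comparator before comparing, since sort() accepts only 
a single key argument.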







[jira] [Work logged] (BEAM-5519) Spark Streaming Duplicated Encoding/Decoding Effort

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5519?focusedWorklogId=150586=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150586
 ]

ASF GitHub Bot logged work on BEAM-5519:


Author: ASF GitHub Bot
Created on: 03/Oct/18 01:45
Start Date: 03/Oct/18 01:45
Worklog Time Spent: 10m 
  Work Description: amitsela commented on issue #6511: [BEAM-5519] Remove 
call to groupByKey in Spark Streaming.
URL: https://github.com/apache/beam/pull/6511#issuecomment-426485740
 
 
   Sure. Run on a cluster and make sure there's no shuffle on RDDs that contain 
deserialized data, otherwise the runner should use coders before/after a 
shuffle.




Issue Time Tracking
---

Worklog Id: (was: 150586)
Time Spent: 1h 10m  (was: 1h)

> Spark Streaming Duplicated Encoding/Decoding Effort
> ---
>
> Key: BEAM-5519
> URL: https://issues.apache.org/jira/browse/BEAM-5519
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Winkelman
>Assignee: Kyle Winkelman
>Priority: Major
>  Labels: spark, spark-streaming
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When using the SparkRunner in streaming mode, there is a call to groupByKey 
> followed by a call to updateStateByKey. BEAM-1815 fixed an issue where this 
> used to cause 2 shuffles, but it still causes 2 encode/decode cycles.





[jira] [Work logged] (BEAM-3904) Don't use UUID when worker_id is missing

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3904?focusedWorklogId=150585=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150585
 ]

ASF GitHub Bot logged work on BEAM-3904:


Author: ASF GitHub Bot
Created on: 03/Oct/18 01:43
Start Date: 03/Oct/18 01:43
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6552: [BEAM-3904] Remove 
default worker id in python
URL: https://github.com/apache/beam/pull/6552#issuecomment-426485472
 
 
   Run Python Dataflow ValidatesRunner




Issue Time Tracking
---

Worklog Id: (was: 150585)
Time Spent: 20m  (was: 10m)

> Don't use UUID when worker_id is missing
> 
>
> Key: BEAM-3904
> URL: https://issues.apache.org/jira/browse/BEAM-3904
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Remove the default to a UUID when worker_id is not present and throw an 
> exception in worker_id_interceptor.py once the corresponding container 
> changes have been rolled out.





[jira] [Work logged] (BEAM-3904) Don't use UUID when worker_id is missing

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3904?focusedWorklogId=150584=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150584
 ]

ASF GitHub Bot logged work on BEAM-3904:


Author: ASF GitHub Bot
Created on: 03/Oct/18 01:41
Start Date: 03/Oct/18 01:41
Worklog Time Spent: 10m 
  Work Description: angoenka opened a new pull request #6552: [BEAM-3904] 
Remove default worker id in python
URL: https://github.com/apache/beam/pull/6552
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   




Issue Time Tracking
---

Worklog Id: (was: 150584)
Time Spent: 10m
Remaining Estimate: 0h

> Don't use UUID when worker_id is missing
> 
>
> Key: BEAM-3904
> URL: https://issues.apache.org/jira/browse/BEAM-3904
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>  

[jira] [Commented] (BEAM-5614) Using gs:// paths without first doing a "gcloud auth" gives an unhelpful error message

2018-10-02 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636321#comment-16636321
 ] 

Udi Meiri commented on BEAM-5614:
-

Full output:

WARNING: Your application has authenticated using end user credentials from 
Google Cloud SDK. We recommend that most server applications use service 
accounts instead. If your application continues to use end user credentials 
from Cloud SDK, you might receive a "quota exceeded" or
"API not enabled" error. For more information about service accounts, see 
https://cloud.google.com/docs/authentication/.
Exception in thread "main" java.lang.RuntimeException: Failed to construct 
instance from factory method DataflowRunner#fromOptions(interface 
org.apache.beam.sdk.options.PipelineOptions)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:224)
    at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:155)
    at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create(Pipeline.java:145)
    at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:176)
    at org.apache.beam.examples.WordCount.main(WordCount.java:192)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:214)
    ... 5 more
Caused by: java.lang.IllegalArgumentException: DataflowRunner requires 
gcpTempLocation, but failed to retrieve a value from PipelineOptions
    at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:243)
    ... 10 more
Caused by: java.lang.IllegalArgumentException: Error constructing default value 
for gcpTempLocation: tempLocation is not a valid GCS path, 
gs://XXX/staging/.
    at org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory.create(GcpOptions.java:255)
    at org.apache.beam.sdk.extensions.gcp.options.GcpOptions$GcpTempLocationFactory.create(GcpOptions.java:232)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.returnDefaultHelper(ProxyInvocationHandler.java:592)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.getDefault(ProxyInvocationHandler.java:533)
    at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:158)
    at com.sun.proxy.$Proxy15.getGcpTempLocation(Unknown Source)
    at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:241)

 

[jira] [Updated] (BEAM-5614) Using gs:// paths without first doing a "gcloud auth" gives an unhelpful error message

2018-10-02 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-5614:

Summary: Using gs:// paths without first doing a "gcloud auth" gives an 
unhelpful error message  (was: Using gs:// paths without first doing a "gcloud 
auth" give unhelpful error message)

> Using gs:// paths without first doing a "gcloud auth" gives an unhelpful 
> error message
> --
>
> Key: BEAM-5614
> URL: https://issues.apache.org/jira/browse/BEAM-5614
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Udi Meiri
>Assignee: Chamikara Jayalath
>Priority: Major
>
> Users see an error like:
> java.lang.IllegalArgumentException: Error constructing default value for 
> gcpTempLocation: tempLocation is not a valid GCS path, 
> gs://clouddfe-vanya/staging/.
> Also reported here: 
> https://stackoverflow.com/questions/43026371/apache-beam-minimalwordcount-example-with-dataflow-runner-on-eclipse



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5614) Using gs:// paths without first doing a "gcloud auth" gives an unhelpful error message

2018-10-02 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-5614:

Description: 
Users see an error like:
java.lang.IllegalArgumentException: Error constructing default value for 
gcpTempLocation: tempLocation is not a valid GCS path, gs://bucket/path/.

Also reported here: 
https://stackoverflow.com/questions/43026371/apache-beam-minimalwordcount-example-with-dataflow-runner-on-eclipse

  was:
Users see an error like:
java.lang.IllegalArgumentException: Error constructing default value for 
gcpTempLocation: tempLocation is not a valid GCS path, 
gs://clouddfe-vanya/staging/.

Also reported here: 
https://stackoverflow.com/questions/43026371/apache-beam-minimalwordcount-example-with-dataflow-runner-on-eclipse


> Using gs:// paths without first doing a "gcloud auth" gives an unhelpful 
> error message
> --
>
> Key: BEAM-5614
> URL: https://issues.apache.org/jira/browse/BEAM-5614
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Udi Meiri
>Assignee: Chamikara Jayalath
>Priority: Major
>
> Users see an error like:
> java.lang.IllegalArgumentException: Error constructing default value for 
> gcpTempLocation: tempLocation is not a valid GCS path, gs://bucket/path/.
> Also reported here: 
> https://stackoverflow.com/questions/43026371/apache-beam-minimalwordcount-example-with-dataflow-runner-on-eclipse





[jira] [Created] (BEAM-5614) Using gs:// paths without first doing a "gcloud auth" give unhelpful error message

2018-10-02 Thread Udi Meiri (JIRA)
Udi Meiri created BEAM-5614:
---

 Summary: Using gs:// paths without first doing a "gcloud auth" 
give unhelpful error message
 Key: BEAM-5614
 URL: https://issues.apache.org/jira/browse/BEAM-5614
 Project: Beam
  Issue Type: Bug
  Components: io-java-gcp
Reporter: Udi Meiri
Assignee: Chamikara Jayalath


Users see an error like:
java.lang.IllegalArgumentException: Error constructing default value for 
gcpTempLocation: tempLocation is not a valid GCS path, 
gs://clouddfe-vanya/staging/.

Also reported here: 
https://stackoverflow.com/questions/43026371/apache-beam-minimalwordcount-example-with-dataflow-runner-on-eclipse





[jira] [Work logged] (BEAM-5467) Python Flink ValidatesRunner job fixes

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5467?focusedWorklogId=150581=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150581
 ]

ASF GitHub Bot logged work on BEAM-5467:


Author: ASF GitHub Bot
Created on: 03/Oct/18 00:59
Start Date: 03/Oct/18 00:59
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6532: 
[BEAM-5467] NOT_FOR_REVIEW Use process SDKHarness to run flink PVR tests.
URL: https://github.com/apache/beam/pull/6532#discussion_r222155981
 
 

 ##
 File path: sdks/python/scripts/process_sdk_worker.sh
 ##
 @@ -0,0 +1,43 @@
+#!/bin/bash
+#
+#Licensed to the Apache Software Foundation (ASF) under one or more
+#contributor license agreements.  See the NOTICE file distributed with
+#this work for additional information regarding copyright ownership.
+#The ASF licenses this file to You under the Apache License, Version 2.0
+#(the "License"); you may not use this file except in compliance with
+#the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+#
+
+###
+# This script will be run the python sdk worker.
+#
+# Positional Parameters:
+# --id=
+# --logging_endpoint=
+# --artifact_endpoint=
+# --provision_endpoint=
+# --control_endpoint=

> Python Flink ValidatesRunner job fixes
> --
>
> Key: BEAM-5467
> URL: https://issues.apache.org/jira/browse/BEAM-5467
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
>  Labels: portability-flink
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Add status to README
> Rename script and job for consistency
>  





[jira] [Resolved] (BEAM-5022) Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to BeamModulePlugin

2018-10-02 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-5022.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to 
> BeamModulePlugin
> --
>
> Key: BEAM-5022
> URL: https://issues.apache.org/jira/browse/BEAM-5022
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, runner-flink
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.8.0
>
>
> Move beam-sdks-java-fn-execution#createPortableValidatesRunnerTask to 
> BeamModulePlugin so that it can be used by other portable runners' tests.
>  
> Also create an interface TestJobserverDriver and make the drivers extend it 
> instead of using reflection to start the Jobserver.





[jira] [Updated] (BEAM-5090) Use topological sort during ProcessBundle in Java SDKHarness

2018-10-02 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-5090:
---
Labels: newbie  (was: )

> Use topological sort during ProcessBundle in Java SDKHarness
> 
>
> Key: BEAM-5090
> URL: https://issues.apache.org/jira/browse/BEAM-5090
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: newbie
>
> In reference to comment 
> [https://github.com/apache/beam/pull/6093#issuecomment-410831830]
>  * Use QueryablePipeline#getTopologicallyOrderedTransforms and execute 
> processBundle requests.
>  * Explore: is it worth caching the sorted structure when the register request is 
> received?
>  * Also, explore how we can handle cycles in the execution stage in process 
> bundle request if any.
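
The ordering and cycle questions raised in the bullets above can be sketched with Kahn's algorithm. This is only an illustration in Python, not the QueryablePipeline#getTopologicallyOrderedTransforms implementation; the function name `topological_order` and the input shape (a dict mapping each transform name to the set of upstream transform names, all of which must also be keys) are assumptions made for the example.

```python
from collections import deque


def topological_order(transforms):
    """Return transform names so that every transform follows its upstreams.

    `transforms` maps a transform name to the set of names it consumes from.
    Hypothetical input shape; every upstream name must also be a key.
    """
    indegree = {name: 0 for name in transforms}
    consumers = {name: [] for name in transforms}
    for name, upstream in transforms.items():
        for dep in upstream:
            indegree[name] += 1
            consumers[dep].append(name)

    # Start with transforms that have no upstream; sort for a stable result.
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for consumer in sorted(consumers[current]):
            indegree[consumer] -= 1
            if indegree[consumer] == 0:
                ready.append(consumer)

    # Any transform left unprocessed is part of a cycle.
    if len(order) != len(transforms):
        raise ValueError('cycle detected among transforms')
    return order
```

Because the result depends only on the graph, it could be computed once at registration time and reused per bundle, which is the caching trade-off the second bullet asks about.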





[jira] [Commented] (BEAM-5315) Finish Python 3 porting for io module

2018-10-02 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636297#comment-16636297
 ] 

Valentyn Tymofieiev commented on BEAM-5315:
---

Note that the Datastore dependency currently used in Beam is not 
Python 3-compatible, so we'll have to skip Datastore tests.

https://issues.apache.org/jira/browse/BEAM-4543

> Finish Python 3 porting for io module
> -
>
> Key: BEAM-5315
> URL: https://issues.apache.org/jira/browse/BEAM-5315
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>






[jira] [Resolved] (BEAM-4826) Flink runner sends bad flatten to SDK

2018-10-02 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-4826.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Flink runner sends bad flatten to SDK
> -
>
> Key: BEAM-4826
> URL: https://issues.apache.org/jira/browse/BEAM-4826
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Henning Rohde
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: portability
> Fix For: 2.8.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> For a Go flatten test with 3 inputs, the Flink runner splits this into 3 bundle 
> descriptors. But it sends the original 3-input flatten with only 1 actual input 
> present in each bundle descriptor. This is inconsistent and the SDK shouldn't 
> expect dangling PCollections. In contrast, Dataflow removes the flatten when 
> it does the same split.
> Snippet:
> register: <
>   process_bundle_descriptor: <
> id: "3"
> transforms: <
>   key: "e4"
>   value: <
> unique_name: "github.com/apache/beam/sdks/go/pkg/beam.createFn'1"
> spec: <
>   urn: "urn:beam:transform:pardo:v1"
>   payload: [...]
> >
> inputs: <
>   key: "i0"
>   value: "n3"
> >
> outputs: <
>   key: "i0"
>   value: "n4"
> >
>   >
> >
> transforms: <
>   key: "e7"
>   value: <
> unique_name: "Flatten"
> spec: <
>   urn: "beam:transform:flatten:v1"
> >
> inputs: <
>   key: "i0"
>   value: "n2"
> >
> inputs: <
>   key: "i1"
>   value: "n4" . // <--- only one present.
> >
> inputs: <
>   key: "i2"
>   value: "n6"
> >
> outputs: <
>   key: "i0"
>   value: "n7"
> >
>   >
> >
> [...]





[jira] [Work logged] (BEAM-5613) Snapshot Python dependencies and add to Python_NightlySnapshot

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5613?focusedWorklogId=150580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150580
 ]

ASF GitHub Bot logged work on BEAM-5613:


Author: ASF GitHub Bot
Created on: 03/Oct/18 00:44
Start Date: 03/Oct/18 00:44
Worklog Time Spent: 10m 
  Work Description: markflyhigh opened a new pull request #6551: 
[BEAM-5613] Snapshot of Python depedency and add it to nightly snapshot job
URL: https://github.com/apache/beam/pull/6551
 
 
   This change makes a snapshot of the Python dependencies and publishes it to the same GCS 
directory that the nightly snapshot job uses. It helps track changes in Python SDK 
dependencies and build dependency-checking tools based on the 
published data.
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150580)
Time Spent: 10m
Remaining Estimate: 0h

> Snapshot Python dependencies and add to Python_NightlySnapshot
> 

[jira] [Resolved] (BEAM-5194) Pipeline options with multi value are not deserialized correctly from map

2018-10-02 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-5194.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Pipeline options with multi value are not deserialized correctly from map
> -
>
> Key: BEAM-5194
> URL: https://issues.apache.org/jira/browse/BEAM-5194
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/7c41e0a915083bd3b1fe52c2a417fa38a00e6463/sdks/python/apache_beam/options/pipeline_options.py#L171]
>  
> Multi-valued options are converted to strings and added to flags, which causes 
> incorrect deserialization.





[jira] [Resolved] (BEAM-5190) Python pipeline options are not picked correctly by PortableRunner

2018-10-02 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-5190.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Python pipeline options are not picked correctly by PortableRunner
> --
>
> Key: BEAM-5190
> URL: https://issues.apache.org/jira/browse/BEAM-5190
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Python SDK worker deserializes the pipeline options to a dictionary instead 
> of PipelineOptions.
> Sample log:
> [grpc-default-executor-2] INFO sdk_worker_main.main - Python sdk harness 
> started with pipeline_options: {u'beam:option:flink_master:v1': u'[auto]', 
> u'beam:option:streaming:v1': False, u'beam:option:experiments:v1': 
> [u'beam_fn_api', u'worker_threads=50'], u'beam:option:dry_run:v1': False, 
> u'beam:option:runner:v1': None, u'beam:option:profile_memory:v1': False, 
> u'beam:option:runtime_type_check:v1': False, u'beam:option:region:v1': 
> u'us-central1', u'beam:option:options_id:v1': 1, u'beam:option:no_auth:v1': 
> False, u'beam:option:dataflow_endpoint:v1': 
> u'https://dataflow.googleapis.com', u'beam:option:sdk_location:v1': 
> u'/usr/local/google/home/goenka/d/work/beam/beam/sdks/python/dist/apache-beam-2.7.0.dev0.tar.gz',
>  u'beam:option:direct_runner_use_stacked_bundle:v1': True, 
> u'beam:option:save_main_session:v1': True, 
> u'beam:option:type_check_strictness:v1': u'DEFAULT_TO_ANY', 
> u'beam:option:profile_cpu:v1': False, u'beam:option:job_endpoint:v1': 
> u'localhost:8099', u'beam:option:job_name:v1': 
> u'BeamApp-goenka-0822071645-48ae1008', u'beam:option:temp_location:v1': 
> u'gs://clouddfe-goenka/tmp/', u'beam:option:app_name:v1': None, 
> u'beam:option:project:v1': u'google.com:clouddfe', 
> u'beam:option:pipeline_type_check:v1': True, 
> u'beam:option:staging_location:v1': u'gs://clouddfe-goenka/tmp/staging'} 
>  
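
To illustrate the shape of the problem in the sample log, here is a minimal Python sketch that turns a dictionary keyed by `beam:option:<name>:v1` into command-line style flags, emitting one flag per element for list values instead of stringifying the whole list. The helper name `options_dict_to_flags` and its handling of `None` and boolean values are assumptions for illustration; this is not the fix that was merged into the SDK worker.

```python
import re

# Key layout taken from the sample log: beam:option:<name>:v1
_OPTION_KEY = re.compile(r'^beam:option:(?P<name>.+):v1$')


def options_dict_to_flags(options):
    """Convert an options dict into a list of '--name=value' flags.

    Hypothetical helper: list values become repeated flags, True becomes a
    bare flag, and None/False entries are dropped.
    """
    flags = []
    for key, value in sorted(options.items()):
        match = _OPTION_KEY.match(key)
        name = match.group('name') if match else key
        if value is None or value is False:
            continue
        if value is True:
            flags.append('--{}'.format(name))
        elif isinstance(value, (list, tuple)):
            # One flag per element keeps multi-valued options intact.
            flags.extend('--{}={}'.format(name, item) for item in value)
        else:
            flags.append('--{}={}'.format(name, value))
    return flags
```

Whether a given option parser accepts repeated flags for a multi-valued option is itself an assumption that would need checking against the parser in use.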





[jira] [Commented] (BEAM-5187) Create a ProcessJobBundleFactory for non-dockerized SDK harness

2018-10-02 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636285#comment-16636285
 ] 

Ankur Goenka commented on BEAM-5187:


This seems to be done. Shall we close it?

> Create a ProcessJobBundleFactory for non-dockerized SDK harness
> ---
>
> Key: BEAM-5187
> URL: https://issues.apache.org/jira/browse/BEAM-5187
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> As discussed on the mailing list [1], we want to give users an option to 
> execute portable pipelines without Docker. Analogous to the 
> {{DockerJobBundleFactory}}, a {{ProcessJobBundleFactory}} could be added to 
> directly fork SDK harness processes.
> Artifacts will be provided by an artifact directory or could be set up similarly 
> to the existing bootstrapping code ("boot.go") which we use for containers.
> The process-based execution can optionally be configured via the pipeline 
> options.
> [1] 
> [https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E]





[jira] [Resolved] (BEAM-2769) Java SDK support for submitting a Portable Pipeline

2018-10-02 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-2769.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Java SDK support for submitting a Portable Pipeline
> ---
>
> Key: BEAM-2769
> URL: https://issues.apache.org/jira/browse/BEAM-2769
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: portability
> Fix For: 2.8.0
>
>
> The Java codebase should provide a way to submit a Job to a Job Service.





[jira] [Commented] (BEAM-2769) Java SDK support for submitting a Portable Pipeline

2018-10-02 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636280#comment-16636280
 ] 

Ankur Goenka commented on BEAM-2769:


Yes, we can submit portable pipelines now.

> Java SDK support for submitting a Portable Pipeline
> ---
>
> Key: BEAM-2769
> URL: https://issues.apache.org/jira/browse/BEAM-2769
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: portability
> Fix For: 2.8.0
>
>
> The Java codebase should provide a way to submit a Job to a Job Service.





[jira] [Resolved] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-10-02 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-5288.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> Following the mailing list discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment and 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>// (Required) The URN of the payload
>string urn = 1;
>// (Optional) The data specifying any parameters to the URN. If
>// the URN does not require any arguments, this may be omitted.
>bytes payload = 2;
>  }
>  message StandardEnvironments {
>enum Environments {
>  DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>  PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>  EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>}
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>string os = 1;  // "linux", "darwin", ..
>string arch = 2;  // "amd64", ..
>string command = 3; // process to execute
>string command = 3; // process to execute
>map<string, string> env = 4; // environment variables
>  }
> {noformat}
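
As a rough Python illustration of the URN-plus-payload idea in the proto above, the sketch below mirrors the two payload messages as dataclasses and checks that a payload matches its URN. The URN strings and field names come from the quoted definition; the `PAYLOAD_TYPES` table and `check_environment` function are hypothetical, and real code would use the generated protobuf classes and a serialized `bytes` payload rather than these stand-ins.

```python
from dataclasses import dataclass, field
from typing import Dict

# URNs copied from the StandardEnvironments enum quoted above.
DOCKER_URN = 'beam:env:docker:v1'
PROCESS_URN = 'beam:env:process:v1'
EXTERNAL_URN = 'beam:env:external:v1'


@dataclass
class DockerPayload:
    container_image: str  # implicitly linux_amd64


@dataclass
class ProcessPayload:
    os: str       # "linux", "darwin", ..
    arch: str     # "amd64", ..
    command: str  # process to execute
    env: Dict[str, str] = field(default_factory=dict)  # environment variables


# Hypothetical lookup table; the quoted proto defines no payload for EXTERNAL.
PAYLOAD_TYPES = {DOCKER_URN: DockerPayload, PROCESS_URN: ProcessPayload}


def check_environment(urn, payload):
    """Return the payload if its type matches the URN, else raise."""
    expected = PAYLOAD_TYPES.get(urn)
    if expected is None:
        raise ValueError('no payload type known for URN {!r}'.format(urn))
    if not isinstance(payload, expected):
        raise TypeError('URN {!r} expects {}'.format(urn, expected.__name__))
    return payload
```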



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5417) FileSystems.match behaviour diff between GCS and local file system

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5417?focusedWorklogId=150579=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150579
 ]

ASF GitHub Bot logged work on BEAM-5417:


Author: ASF GitHub Bot
Created on: 03/Oct/18 00:23
Start Date: 03/Oct/18 00:23
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #6423: [BEAM-5417] Parity 
between GCS and local match
URL: https://github.com/apache/beam/pull/6423#issuecomment-426473476
 
 
   There's a small isort fix remaining:
   ```
   00:40:43 ERROR: 
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/io/localfilesystem_test.py
 Imports are incorrectly sorted.
   00:40:43 --- 
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/io/localfilesystem_test.py:before
2018-10-02 07:38:10.365763
   00:40:43 +++ 
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/io/localfilesystem_test.py:after
 2018-10-02 07:40:39.716307
   00:40:43 @@ -28,8 +28,8 @@
   00:40:43  import unittest
   00:40:43  
   00:40:43  import mock
   00:40:43 +from parameterized import param
   00:40:43  from parameterized import parameterized
   00:40:43 -from parameterized import param
   00:40:43  
   00:40:43  from apache_beam.io import localfilesystem
   00:40:43  from apache_beam.io.filesystem import BeamIOError
   ```




Issue Time Tracking
---

Worklog Id: (was: 150579)
Time Spent: 5h 50m  (was: 5h 40m)

> FileSystems.match behaviour diff between GCS and local file system
> --
>
> Key: BEAM-5417
> URL: https://issues.apache.org/jira/browse/BEAM-5417
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.5.0, 2.6.0
>Reporter: Joar Wandborg
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Given the directory structure:
>  
> {noformat}
> .
> ├── filesystem-match-test
> │   ├── a
> │   │   └── file.txt
> │   └── b
> │   └── file.txt
> └── filesystem-match-test.py
> {noformat}
>  
> Where {{filesystem-match-test.py}} contains:
> {code:python}
> from __future__ import print_function
> import os
> import posixpath
> from apache_beam.io.filesystem import MatchResult
> from apache_beam.io.filesystems import FileSystems
> BASES = [
> os.path.join(os.path.dirname(__file__), "./"),
> "gs://my-bucket/test/",
> ]
> pattern = "filesystem-match-test/*/file.txt"
> for base_path in BASES:
> full_pattern = posixpath.join(base_path, pattern)
> print("full_pattern: {}".format(full_pattern))
> match_result = FileSystems.match([full_pattern])[0]  # type: MatchResult
> print("metadata list: {}".format(match_result.metadata_list))
> {code}
> Running {{python filesystem-match-test.py}} does not match any files locally, 
> but does match files on GCS:
> {noformat}
> full_pattern: ./filesystem-match-test/*/file.txt
> metadata list: []
> full_pattern: gs://my-bucket/test/filesystem-match-test/*/file.txt
> metadata list: 
> [FileMetadata(gs://my-bucket/test/filesystem-match-test/a/file.txt, 6), 
> FileMetadata(gs://my-bucket/test/filesystem-match-test/b/file.txt, 6)]
> {noformat}
> The expected result is that a/file.txt and b/file.txt should be matched for 
> both patterns.





[jira] [Work logged] (BEAM-5417) FileSystems.match behaviour diff between GCS and local file system

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5417?focusedWorklogId=150578=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150578
 ]

ASF GitHub Bot logged work on BEAM-5417:


Author: ASF GitHub Bot
Created on: 03/Oct/18 00:22
Start Date: 03/Oct/18 00:22
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #6423: 
[BEAM-5417] Parity between GCS and local match
URL: https://github.com/apache/beam/pull/6423#discussion_r222150749
 
 

 ##
 File path: sdks/python/apache_beam/io/filesystem.py
 ##
 @@ -531,24 +530,117 @@ def _list(self, dir_or_prefix):
 """
 raise NotImplementedError
 
+  @staticmethod
+  def _split_scheme(url_or_path):
+match = re.match(r'(^[a-z]+)://(.*)', url_or_path)
+if match is not None:
+  return match.groups()
+return None, url_or_path
+
+  @staticmethod
+  def _combine_scheme(scheme, path):
+if scheme is None:
+  return path
+return '{}://{}'.format(scheme, path)
+
   def _url_dirname(self, url_or_path):
 """Like posixpath.dirname, but preserves scheme:// prefix.
 
 Args:
   url_or_path: A string in the form of scheme://some/path OR /some/path.
 """
-match = re.match(r'([a-z]+://)(.*)', url_or_path)
-if match is None:
-  return posixpath.dirname(url_or_path)
-url_prefix, path = match.groups()
-return url_prefix + posixpath.dirname(path)
+scheme, path = self._split_scheme(url_or_path)
 
 Review comment:
   Alright, let's keep it split.
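
The two helpers added in the diff above can be tried outside the class. The sketch below copies the regex and logic from the diff into module-level functions (the names without the leading underscore are a convenience for the example) and combines them with `posixpath.dirname`; how the final `_url_dirname` uses them beyond the first line shown in the diff is an assumption.

```python
import posixpath
import re


def split_scheme(url_or_path):
    """Split 'scheme://rest' into (scheme, rest); plain paths get scheme None."""
    match = re.match(r'(^[a-z]+)://(.*)', url_or_path)
    if match is not None:
        return match.groups()
    return None, url_or_path


def combine_scheme(scheme, path):
    """Inverse of split_scheme."""
    if scheme is None:
        return path
    return '{}://{}'.format(scheme, path)


def url_dirname(url_or_path):
    """Like posixpath.dirname, but preserves a scheme:// prefix (assumed body)."""
    scheme, path = split_scheme(url_or_path)
    return combine_scheme(scheme, posixpath.dirname(path))
```

For example, `url_dirname('gs://bucket/a/file.txt')` keeps the `gs://` prefix while a local path is handled exactly as `posixpath.dirname` would.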




Issue Time Tracking
---

Worklog Id: (was: 150578)
Time Spent: 5h 40m  (was: 5.5h)

> FileSystems.match behaviour diff between GCS and local file system
> --
>
> Key: BEAM-5417
> URL: https://issues.apache.org/jira/browse/BEAM-5417
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.5.0, 2.6.0
>Reporter: Joar Wandborg
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Given the directory structure:
>  
> {noformat}
> .
> ├── filesystem-match-test
> │   ├── a
> │   │   └── file.txt
> │   └── b
> │   └── file.txt
> └── filesystem-match-test.py
> {noformat}
>  
> Where {{filesystem-match-test.py}} contains:
> {code:python}
> from __future__ import print_function
> import os
> import posixpath
> from apache_beam.io.filesystem import MatchResult
> from apache_beam.io.filesystems import FileSystems
> BASES = [
> os.path.join(os.path.dirname(__file__), "./"),
> "gs://my-bucket/test/",
> ]
> pattern = "filesystem-match-test/*/file.txt"
> for base_path in BASES:
> full_pattern = posixpath.join(base_path, pattern)
> print("full_pattern: {}".format(full_pattern))
> match_result = FileSystems.match([full_pattern])[0]  # type: MatchResult
> print("metadata list: {}".format(match_result.metadata_list))
> {code}
> Running {{python filesystem-match-test.py}} does not match any files locally, 
> but does match files on GCS:
> {noformat}
> full_pattern: ./filesystem-match-test/*/file.txt
> metadata list: []
> full_pattern: gs://my-bucket/test/filesystem-match-test/*/file.txt
> metadata list: 
> [FileMetadata(gs://my-bucket/test/filesystem-match-test/a/file.txt, 6), 
> FileMetadata(gs://my-bucket/test/filesystem-match-test/b/file.txt, 6)]
> {noformat}
> The expected result is that a/file.txt and b/file.txt should be matched for 
> both patterns.





[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=150577=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150577
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 03/Oct/18 00:21
Start Date: 03/Oct/18 00:21
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6550: [BEAM-4176] 
Correctly deserialize pipeline options on Fn harness
URL: https://github.com/apache/beam/pull/6550#issuecomment-426473092
 
 
   R: @mxm @tweise @ryan-williams 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150577)
Time Spent: 25h 50m  (was: 25h 40m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png, Screen Shot 
> 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 25h 50m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.





[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=150576=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150576
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 03/Oct/18 00:20
Start Date: 03/Oct/18 00:20
Worklog Time Spent: 10m 
  Work Description: angoenka opened a new pull request #6550: [BEAM-4176] 
Correctly deserialize pipeline options on Fn harness
URL: https://github.com/apache/beam/pull/6550
 
 
   Deserialize the pipeline options using protobuf libraries
   
   




Issue Time Tracking
---

Worklog Id: (was: 150576)
Time Spent: 25h 40m  (was: 25.5h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue 

[beam] branch asf-site updated: Publishing website 2018/10/03 00:01:08 at commit c48f429

2018-10-02 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 5a67f50  Publishing website 2018/10/03 00:01:08 at commit c48f429
5a67f50 is described below

commit 5a67f500d0e03f616baed3498eb042a5bf80d720
Author: jenkins 
AuthorDate: Wed Oct 3 00:01:09 2018 +

Publishing website 2018/10/03 00:01:08 at commit c48f429



Jenkins build is back to normal : beam_PreCommit_Website_Cron #132

2018-10-02 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Python_Verify #6149

2018-10-02 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-5613) Snapshot Python dependencies and add to Python_NightlySnapshot

2018-10-02 Thread Mark Liu (JIRA)
Mark Liu created BEAM-5613:
--

 Summary: Snapshot Python dependencies and add to 
Python_NightlySnapshot
 Key: BEAM-5613
 URL: https://issues.apache.org/jira/browse/BEAM-5613
 Project: Beam
  Issue Type: Improvement
  Components: dependencies
Reporter: Mark Liu
Assignee: Mark Liu


The Python SDK depends on a list of libraries without pinned versions. An 
unexpected or unnoticed version change can happen when a dependency publishes 
a new release or when a version range changes in setup.py. Dependency-checking 
tools can detect this, but they need the dependency list and versions to be 
tracked periodically. 

We could snapshot dependencies periodically using pip freeze and python 
setup.py egg_info and upload them to a publicly accessible location to benefit 
people who want to check dependency info or build related tools.
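
A rough sketch of the snapshot-and-compare idea, assuming Python 3.8+ where importlib.metadata is available. It is not the proposed Jenkins job; `snapshot_installed` and `diff_snapshots` are hypothetical names, and a real job might shell out to pip freeze instead.

```python
from importlib import metadata


def snapshot_installed():
    """Return {name: version} for every distribution visible to this interpreter."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            result[name.lower()] = dist.version
    return result


def diff_snapshots(old, new):
    """Return (added, removed, changed) between two {name: version} snapshots."""
    added = {n: v for n, v in new.items() if n not in old}
    removed = {n: v for n, v in old.items() if n not in new}
    changed = {n: (old[n], new[n]) for n in old if n in new and old[n] != new[n]}
    return added, removed, changed


if __name__ == "__main__":
    # Print the current environment in a pip-freeze-like form.
    for name, version in sorted(snapshot_installed().items()):
        print("{}=={}".format(name, version))
```

Comparing the stored snapshot from the previous run against the current one with `diff_snapshots` would surface any version that moved without a change to setup.py.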





[jira] [Assigned] (BEAM-4625) Support ALL and SOME kind of comparison operator

2018-10-02 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang reassigned BEAM-4625:
--

Assignee: (was: Xu Mingmin)

> Support ALL and SOME kind of comparison operator
> 
>
> Key: BEAM-4625
> URL: https://issues.apache.org/jira/browse/BEAM-4625
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Priority: Major
>






[jira] [Commented] (BEAM-5610) Make BeamCalciteTable public

2018-10-02 Thread Rui Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636228#comment-16636228
 ] 

Rui Wang commented on BEAM-5610:


Instead, we might be able to make BeamCalciteTable an inner class of 
BeamCalciteSchema. 

> Make BeamCalciteTable public
> 
>
> Key: BEAM-5610
> URL: https://issues.apache.org/jira/browse/BEAM-5610
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Closed] (BEAM-5596) BeamSQL as a Library

2018-10-02 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang closed BEAM-5596.
--
   Resolution: Not A Problem
Fix Version/s: Not applicable

> BeamSQL as a Library
> 
>
> Key: BEAM-5596
> URL: https://issues.apache.org/jira/browse/BEAM-5596
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>
> Build BeamSQL as a jar which does not include any extra dependencies. 
> Therefore we can use this jar to use BeamSQL as a lib.





[jira] [Commented] (BEAM-5596) BeamSQL as a Library

2018-10-02 Thread Rui Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636226#comment-16636226
 ] 

Rui Wang commented on BEAM-5596:


`./gradlew jar` indeed builds an unshaded jar.

> BeamSQL as a Library
> 
>
> Key: BEAM-5596
> URL: https://issues.apache.org/jira/browse/BEAM-5596
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>
> Build BeamSQL as a jar which does not include any extra dependencies. 
> Therefore we can use this jar to use BeamSQL as a lib.





[jira] [Created] (BEAM-5612) Add tox suites to exercise unit tests using Python3 interpreter with cython, and with gcp dependencies.

2018-10-02 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5612:
-

 Summary: Add tox suites to exercise unit tests using Python3 
interpreter with cython, and with gcp dependencies.
 Key: BEAM-5612
 URL: https://issues.apache.org/jira/browse/BEAM-5612
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: Ahmet Altay








[beam] branch asf-site updated: Publishing website 2018/10/02 23:10:57 at commit c48f429

2018-10-02 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 318ff2f  Publishing website 2018/10/02 23:10:57 at commit c48f429
318ff2f is described below

commit 318ff2f2cf75a3b19ba758da8ebc324dfbd97253
Author: jenkins 
AuthorDate: Tue Oct 2 23:10:58 2018 +

Publishing website 2018/10/02 23:10:57 at commit c48f429



[jira] [Commented] (BEAM-5125) beam_PostCommit_Java_GradleBuild org.apache.beam.runners.flink PortableExecutionTest testExecution_1_ flaky

2018-10-02 Thread Scott Wegner (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16636215#comment-16636215
 ] 

Scott Wegner commented on BEAM-5125:


The 2018/08/23 deadline has passed; can this be closed?

> beam_PostCommit_Java_GradleBuild org.apache.beam.runners.flink 
> PortableExecutionTest testExecution_1_ flaky
> ---
>
> Key: BEAM-5125
> URL: https://issues.apache.org/jira/browse/BEAM-5125
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The test fails in both post-commit and pre-commit runs, more often in pre-commit.
> Pre-commit history: 
> [https://builds.apache.org/job/beam_PreCommit_Java_Phrase/180/testReport/junit/org.apache.beam.runners.flink/PortableExecutionTest/testExecution_1_/history/]
> Post-commit history: 
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1223/testReport/junit/org.apache.beam.runners.flink/PortableExecutionTest/testExecution_1_/history/?start=75]
> Sample job:
> https://builds.apache.org/job/beam_PreCommit_Java_Phrase/180/testReport/junit/org.apache.beam.runners.flink/PortableExecutionTest/testExecution_1_/
> Log:
> java.lang.AssertionError: job state expected: but was: at 
> org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.failNotEquals(Assert.java:834) at 
> org.junit.Assert.assertEquals(Assert.java:118) at 
> org.apache.beam.runners.flink.PortableExecutionTest.testExecution(PortableExecutionTest.java:177)
>  





[jira] [Work logged] (BEAM-5534) AutocompleteIT and WikiTopSessionsIT fail in google testing

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5534?focusedWorklogId=150560=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150560
 ]

ASF GitHub Bot logged work on BEAM-5534:


Author: ASF GitHub Bot
Created on: 02/Oct/18 23:09
Start Date: 02/Oct/18 23:09
Worklog Time Spent: 10m 
  Work Description: jasonkuster closed pull request #6528: [BEAM-5534] 
Revert "Merge pull request #6411 from huygaa11/revert-6311-twsit"
URL: https://github.com/apache/beam/pull/6528
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
 
b/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
index acc083ba0df..b33bbc00732 100644
--- 
a/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
+++ 
b/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
@@ -19,6 +19,7 @@
 
 import com.google.api.services.bigquery.model.TableRow;
 import java.io.IOException;
+import java.math.BigDecimal;
 import java.util.List;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.io.TextIO;
@@ -76,7 +77,13 @@
 @ProcessElement
 public void processElement(ProcessContext c) {
   TableRow row = c.element();
-  int timestamp = (Integer) row.get("timestamp");
+  int timestamp;
+  // TODO(BEAM-5390): Avoid this workaround.
+  try {
+timestamp = ((BigDecimal) row.get("timestamp")).intValue();
+  } catch (ClassCastException e) {
+timestamp = ((Integer) row.get("timestamp")).intValue();
+  }
   String userName = (String) row.get("contributor_username");
   if (userName != null) {
 // Sets the implicit timestamp field to be used in windowing.
@@ -180,9 +187,9 @@ public void processElement(ProcessContext c) {
   public interface Options extends PipelineOptions {
 @Description("Input specified as a GCS path containing a BigQuery table 
exported as json")
 @Default.String(EXPORTED_WIKI_TABLE)
-String getInput();
+String getWikiInput();
 
-void setInput(String value);
+void setWikiInput(String value);
 
 @Description("File to output results to")
 @Validation.Required
@@ -191,18 +198,21 @@ public void processElement(ProcessContext c) {
 void setOutput(String value);
   }
 
-  public static void main(String[] args) {
-Options options = 
PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);
-
+  public static void run(Options options) {
 Pipeline p = Pipeline.create(options);
 
 double samplingThreshold = 0.1;
 
-p.apply(TextIO.read().from(options.getInput()))
+p.apply(TextIO.read().from(options.getWikiInput()))
 .apply(MapElements.via(new ParseTableRowJson()))
 .apply(new ComputeTopSessions(samplingThreshold))
-.apply("Write", 
TextIO.write().withoutSharding().to(options.getOutput()));
+.apply("Write", TextIO.write().to(options.getOutput()));
 
 p.run().waitUntilFinish();
   }
+
+  public static void main(String[] args) {
+Options options = 
PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class);
+run(options);
+  }
 }
diff --git 
a/examples/java/src/test/java/org/apache/beam/examples/complete/TopWikipediaSessionsIT.java
 
b/examples/java/src/test/java/org/apache/beam/examples/complete/TopWikipediaSessionsIT.java
new file mode 100644
index 000..1f278e7f7a7
--- /dev/null
+++ 
b/examples/java/src/test/java/org/apache/beam/examples/complete/TopWikipediaSessionsIT.java
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.examples.complete;
+
+import java.util.Date;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import 

[beam] branch master updated (be8d469 -> c48f429)

2018-10-02 Thread jaku
This is an automated email from the ASF dual-hosted git repository.

jaku pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from be8d469  [BEAM-5611] Fix failure in testWebsite due to 508 resource 
limit error from www.se-radio.net (#6547)
 add c6fe012  Revert "Merge pull request #6411 from 
huygaa11/revert-6311-twsit"
 add c449442  Fixing odd Options bug
 new c48f429  Merge pull request #6528: [BEAM-5534] TopWikipediaSessionsIT

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../examples/complete/TopWikipediaSessions.java| 26 +++-
 .../TopWikipediaSessionsIT.java}   | 35 ++
 2 files changed, 34 insertions(+), 27 deletions(-)
 copy examples/java/src/test/java/org/apache/beam/examples/{WordCountIT.java => 
complete/TopWikipediaSessionsIT.java} (66%)



[beam] 01/01: Merge pull request #6528: [BEAM-5534] TopWikipediaSessionsIT

2018-10-02 Thread jaku
This is an automated email from the ASF dual-hosted git repository.

jaku pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit c48f429095e5281eb4e2240d35b1b4bd6688e662
Merge: be8d469 c449442
Author: jasonkuster 
AuthorDate: Tue Oct 2 16:09:28 2018 -0700

Merge pull request #6528: [BEAM-5534] TopWikipediaSessionsIT

[BEAM-5534] Revert "Merge pull request #6411 from 
huygaa11/revert-6311-twsit"

 .../examples/complete/TopWikipediaSessions.java| 26 +---
 .../examples/complete/TopWikipediaSessionsIT.java  | 69 ++
 2 files changed, 87 insertions(+), 8 deletions(-)



[jira] [Work logged] (BEAM-5534) AutocompleteIT and WikiTopSessionsIT fail in google testing

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5534?focusedWorklogId=150559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150559
 ]

ASF GitHub Bot logged work on BEAM-5534:


Author: ASF GitHub Bot
Created on: 02/Oct/18 23:08
Start Date: 02/Oct/18 23:08
Worklog Time Spent: 10m 
  Work Description: jasonkuster commented on issue #6528: [BEAM-5534] 
Revert "Merge pull request #6411 from huygaa11/revert-6311-twsit"
URL: https://github.com/apache/beam/pull/6528#issuecomment-426460097
 
 
   LGTM. 




Issue Time Tracking
---

Worklog Id: (was: 150559)
Time Spent: 1h 10m  (was: 1h)

> AutocompleteIT and WikiTopSessionsIT fail in google testing
> ---
>
> Key: BEAM-5534
> URL: https://issues.apache.org/jira/browse/BEAM-5534
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Closed] (BEAM-5610) Make BeamCalciteTable public

2018-10-02 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang closed BEAM-5610.
--
   Resolution: Won't Fix
Fix Version/s: Not applicable

> Make BeamCalciteTable public
> 
>
> Key: BEAM-5610
> URL: https://issues.apache.org/jira/browse/BEAM-5610
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5610) Make BeamCalciteTable public

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5610?focusedWorklogId=150558=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150558
 ]

ASF GitHub Bot logged work on BEAM-5610:


Author: ASF GitHub Bot
Created on: 02/Oct/18 23:06
Start Date: 02/Oct/18 23:06
Worklog Time Spent: 10m 
  Work Description: amaliujia closed pull request #6548: [BEAM-5610] Make 
BeamCalciteTable accessible from other packages.
URL: https://github.com/apache/beam/pull/6548
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamCalciteTable.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamCalciteTable.java
index fd2eddf171d..0d3701497a1 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamCalciteTable.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamCalciteTable.java
@@ -40,7 +40,7 @@
 import org.apache.calcite.schema.TranslatableTable;
 
 /** Adapter from {@link BeamSqlTable} to a calcite Table. */
-class BeamCalciteTable extends AbstractQueryableTable
+public class BeamCalciteTable extends AbstractQueryableTable
 implements ModifiableTable, TranslatableTable {
   private final BeamSqlTable beamTable;
   private final Map pipelineOptions;


 




Issue Time Tracking
---

Worklog Id: (was: 150558)
Time Spent: 40m  (was: 0.5h)

> Make BeamCalciteTable public
> 
>
> Key: BEAM-5610
> URL: https://issues.apache.org/jira/browse/BEAM-5610
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






Jenkins build is back to normal : beam_PostCommit_Python_VR_Flink #205

2018-10-02 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-2953) Create more advanced Timeseries processing examples using state API

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2953?focusedWorklogId=150555=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150555
 ]

ASF GitHub Bot logged work on BEAM-2953:


Author: ASF GitHub Bot
Created on: 02/Oct/18 22:51
Start Date: 02/Oct/18 22:51
Worklog Time Spent: 10m 
  Work Description: akedin commented on issue #6540: [BEAM-2953] Advanced 
Timeseries examples.
URL: https://github.com/apache/beam/pull/6540#issuecomment-426456588
 
 
   @swegner I'm looking at this




Issue Time Tracking
---

Worklog Id: (was: 150555)
Time Spent: 50m  (was: 40m)

> Create more advanced Timeseries processing examples using state API
> ---
>
> Key: BEAM-2953
> URL: https://issues.apache.org/jira/browse/BEAM-2953
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Affects Versions: 2.1.0
>Reporter: Reza ardeshir rokni
>Assignee: Reuven Lax
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> As described in the phase 1 portion of this solution outline:
> https://cloud.google.com/solutions/correlating-time-series-dataflow
> Beam can be used to build some very interesting pre-processing stages for 
> time series data. Some examples that will be useful:
> - Downsampling time series based on simple AVG, MIN, MAX
> - Creating a value for each time window using GenerateSequence as a seed 
> - Loading the value of a downsample with the previous value (used in FX with 
> previous close being brought into current open value) 
> This will show some concrete examples of keyed state as well as the use of 
> combiners. 
> The samples can also be used to show how you can create an ordered list of 
> values per key from an unbounded topic which has multiple time series keys. 
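
The downsampling and previous-value ideas listed above can be sketched in plain Python. This is an illustration only: it uses no Beam APIs or keyed state, assumes integer timestamps in seconds, and the function names `downsample` and `fill_gaps` are invented for the sketch.

```python
from collections import defaultdict


def downsample(points, window_seconds):
    """Group (timestamp, value) pairs into fixed windows and compute MIN, MAX, AVG."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Align each point to the start of the fixed window that contains it.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return {
        start: {"min": min(vals), "max": max(vals), "avg": sum(vals) / len(vals)}
        for start, vals in sorted(buckets.items())
    }


def fill_gaps(stats, window_seconds):
    """Give every empty window between the first and last the previous window's stats."""
    if not stats:
        return {}
    filled = {}
    previous = None
    for window_start in range(min(stats), max(stats) + window_seconds, window_seconds):
        previous = stats.get(window_start, previous)
        filled[window_start] = previous
    return filled


points = [(0, 1.0), (5, 3.0), (25, 10.0)]
stats = downsample(points, 10)
filled = fill_gaps(stats, 10)
print(stats)
print(filled)
```

In a streaming pipeline the carried-forward value would have to live in per-key state rather than in a local variable, which is the part the proposed examples are meant to demonstrate.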





[beam] branch asf-site updated: Publishing website 2018/10/02 22:51:51 at commit be8d469

2018-10-02 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 4f9af11  Publishing website 2018/10/02 22:51:51 at commit be8d469
4f9af11 is described below

commit 4f9af116331d5a6e2cad3206d911ba0e8d3bb7bf
Author: jenkins 
AuthorDate: Tue Oct 2 22:51:51 2018 +

Publishing website 2018/10/02 22:51:51 at commit be8d469



[jira] [Work logged] (BEAM-5611) testWebsite fails due to HTTP 508 error on se-radio.net

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5611?focusedWorklogId=150554=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150554
 ]

ASF GitHub Bot logged work on BEAM-5611:


Author: ASF GitHub Bot
Created on: 02/Oct/18 22:50
Start Date: 02/Oct/18 22:50
Worklog Time Spent: 10m 
  Work Description: swegner closed pull request #6547: [BEAM-5611] Fix 
failure in testWebsite due to 508 resource limit error from www.se-radio.net
URL: https://github.com/apache/beam/pull/6547
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/website/Rakefile b/website/Rakefile
index 5160bad45f5..e814956451d 100644
--- a/website/Rakefile
+++ b/website/Rakefile
@@ -18,7 +18,8 @@ task :test do
 /jstorm.io/,
 /datatorrent.com/,
 /ai.google/, # https://issues.apache.org/jira/browse/INFRA-16527
-/globenewswire.com/ # https://issues.apache.org/jira/browse/BEAM-5518
+/globenewswire.com/, # https://issues.apache.org/jira/browse/BEAM-5518
+/www.se-radio.net/ # BEAM-5611: Can fail with rate limit HTTP 508 error
 ],
 :parallel => { :in_processes => Etc.nprocessors },
 }).run


 




Issue Time Tracking
---

Worklog Id: (was: 150554)
Time Spent: 20m  (was: 10m)

> testWebsite fails due to HTTP 508 error on se-radio.net
> ---
>
> Key: BEAM-5611
> URL: https://issues.apache.org/jira/browse/BEAM-5611
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: 2.8.0
>Reporter: Scott Wegner
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See: https://scans.gradle.com/s/pmvqm6he6w422
> Failure when running: 
> ./gradlew :beam-website:testWebsite
> Error includes:
>  - ./generated-content/documentation/resources/index.html
>   *  External link 
> http://www.se-radio.net/2016/10/se-radio-episode-272-frances-perry-on-apache-beam/
>  failed: 508 No error
> rake aborted!
> HTML-Proofer found 1 failure!
> Opening this link in a browser also shows an error.





[jira] [Work logged] (BEAM-5611) testWebsite fails due to HTTP 508 error on se-radio.net

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5611?focusedWorklogId=150553=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150553
 ]

ASF GitHub Bot logged work on BEAM-5611:


Author: ASF GitHub Bot
Created on: 02/Oct/18 22:50
Start Date: 02/Oct/18 22:50
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #6547: [BEAM-5611] Fix 
failure in testWebsite due to 508 resource limit error from www.se-radio.net
URL: https://github.com/apache/beam/pull/6547#issuecomment-426456387
 
 
   Note: the link seems to resolve for me now, although it didn't this morning. 
Let's add this exclusion for now but revisit it in the future.




Issue Time Tracking
---

Worklog Id: (was: 150553)
Time Spent: 10m
Remaining Estimate: 0h

> testWebsite fails due to HTTP 508 error on se-radio.net
> ---
>
> Key: BEAM-5611
> URL: https://issues.apache.org/jira/browse/BEAM-5611
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: 2.8.0
>Reporter: Scott Wegner
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> See: https://scans.gradle.com/s/pmvqm6he6w422
> Failure when running: 
> ./gradlew :beam-website:testWebsite
> Error includes:
>  - ./generated-content/documentation/resources/index.html
>   *  External link 
> http://www.se-radio.net/2016/10/se-radio-episode-272-frances-perry-on-apache-beam/
>  failed: 508 No error
> rake aborted!
> HTML-Proofer found 1 failure!
> Opening this link in a browser also shows an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (8a523ad -> be8d469)

2018-10-02 Thread scott
This is an automated email from the ASF dual-hosted git repository.

scott pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 8a523ad  Merge pull request #6545 from Ardagan/TestGrafanaDashboards
 add be8d469  [BEAM-5611] Fix failure in testWebsite due to 508 resource 
limit error from www.se-radio.net (#6547)

No new revisions were added by this update.

Summary of changes:
 website/Rakefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)



[jira] [Commented] (BEAM-5611) testWebsite fails due to HTTP 508 error on se-radio.net

2018-10-02 Thread Scott Wegner (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16636207#comment-16636207
 ] 

Scott Wegner commented on BEAM-5611:


I was able to repro the error this morning, although now the link seems to 
resolve again. Let's keep this open for a while to check up on it.

> testWebsite fails due to HTTP 508 error on se-radio.net
> ---
>
> Key: BEAM-5611
> URL: https://issues.apache.org/jira/browse/BEAM-5611
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: 2.8.0
>Reporter: Scott Wegner
>Assignee: Alan Myrvold
>Priority: Major
>
> See: https://scans.gradle.com/s/pmvqm6he6w422
> Failure when running: 
> ./gradlew :beam-website:testWebsite
> Error includes:
>  - ./generated-content/documentation/resources/index.html
>   *  External link 
> http://www.se-radio.net/2016/10/se-radio-episode-272-frances-perry-on-apache-beam/
>  failed: 508 No error
> rake aborted!
> HTML-Proofer found 1 failure!
> Opening this link in a browser also shows an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5611) testWebsite fails due to HTTP 508 error on se-radio.net

2018-10-02 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner updated BEAM-5611:
---
Fix Version/s: (was: 2.8.0)

> testWebsite fails due to HTTP 508 error on se-radio.net
> ---
>
> Key: BEAM-5611
> URL: https://issues.apache.org/jira/browse/BEAM-5611
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: 2.8.0
>Reporter: Scott Wegner
>Assignee: Alan Myrvold
>Priority: Major
>
> See: https://scans.gradle.com/s/pmvqm6he6w422
> Failure when running: 
> ./gradlew :beam-website:testWebsite
> Error includes:
>  - ./generated-content/documentation/resources/index.html
>   *  External link 
> http://www.se-radio.net/2016/10/se-radio-episode-272-frances-perry-on-apache-beam/
>  failed: 508 No error
> rake aborted!
> HTML-Proofer found 1 failure!
> Opening this link in a browser also shows an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5611) testWebsite fails due to HTTP 508 error on se-radio.net

2018-10-02 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner updated BEAM-5611:
---
Description: 
See: https://scans.gradle.com/s/pmvqm6he6w422

Failure when running: 

./gradlew :beam-website:testWebsite

Error includes:

 - ./generated-content/documentation/resources/index.html
  *  External link 
http://www.se-radio.net/2016/10/se-radio-episode-272-frances-perry-on-apache-beam/
 failed: 508 No error
rake aborted!
HTML-Proofer found 1 failure!

Opening this link in a browser also shows an error.

  was:
Failure when running: 

./gradlew :beam-website:testWebsite

Error includes:
 - ./generated-content/blog/2017/02/01/graduation-media-recap.html

  *  External link 
[https://globenewswire.com/news-release/2017/01/10/904692/0/en/The-Apache-Software-Foundation-Announces-Apache-Beam-as-a-Top-Level-Project.html]
 failed: response code 0 means something's wrong.

             It's possible libcurl couldn't connect to the server or perhaps 
the request timed out.

             Sometimes, making too many requests at once also breaks things.

             Either way, the return message (if any) from the server is: Peer 
certificate cannot be authenticated with given CA certificates

rake aborted!

HTML-Proofer found 1 failure!

 

Also fails when running:

curl -v 
[https://globenewswire.com/news-release/2017/01/10/904692/0/en/The-Apache-Software-Foundation-Announces-Apache-Beam-as-a-Top-Level-Project.html]

 

Works fine opening in a browser.


> testWebsite fails due to HTTP 508 error on se-radio.net
> ---
>
> Key: BEAM-5611
> URL: https://issues.apache.org/jira/browse/BEAM-5611
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Affects Versions: 2.8.0
>Reporter: Scott Wegner
>Assignee: Alan Myrvold
>Priority: Major
> Fix For: 2.8.0
>
>
> See: https://scans.gradle.com/s/pmvqm6he6w422
> Failure when running: 
> ./gradlew :beam-website:testWebsite
> Error includes:
>  - ./generated-content/documentation/resources/index.html
>   *  External link 
> http://www.se-radio.net/2016/10/se-radio-episode-272-frances-perry-on-apache-beam/
>  failed: 508 No error
> rake aborted!
> HTML-Proofer found 1 failure!
> Opening this link in a browser also shows an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5611) testWebsite fails due to HTTP 508 error on se-radio.net

2018-10-02 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5611:
--

 Summary: testWebsite fails due to HTTP 508 error on se-radio.net
 Key: BEAM-5611
 URL: https://issues.apache.org/jira/browse/BEAM-5611
 Project: Beam
  Issue Type: Bug
  Components: website
Affects Versions: 2.8.0
Reporter: Scott Wegner
Assignee: Alan Myrvold
 Fix For: 2.8.0


Failure when running: 

./gradlew :beam-website:testWebsite

Error includes:
 - ./generated-content/blog/2017/02/01/graduation-media-recap.html

  *  External link 
[https://globenewswire.com/news-release/2017/01/10/904692/0/en/The-Apache-Software-Foundation-Announces-Apache-Beam-as-a-Top-Level-Project.html]
 failed: response code 0 means something's wrong.

             It's possible libcurl couldn't connect to the server or perhaps 
the request timed out.

             Sometimes, making too many requests at once also breaks things.

             Either way, the return message (if any) from the server is: Peer 
certificate cannot be authenticated with given CA certificates

rake aborted!

HTML-Proofer found 1 failure!

 

Also fails when running:

curl -v 
[https://globenewswire.com/news-release/2017/01/10/904692/0/en/The-Apache-Software-Foundation-Announces-Apache-Beam-as-a-Top-Level-Project.html]

 

Works fine opening in a browser.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2953) Create more advanced Timeseries processing examples using state API

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2953?focusedWorklogId=150545&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150545
 ]

ASF GitHub Bot logged work on BEAM-2953:


Author: ASF GitHub Bot
Created on: 02/Oct/18 22:45
Start Date: 02/Oct/18 22:45
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #6540: [BEAM-2953] Advanced 
Timeseries examples.
URL: https://github.com/apache/beam/pull/6540#issuecomment-426455299
 
 
   @rezarokni thank you for the contribution.
   
   I see the pre-commit checks are failing due to Spotless violations. Please 
run `./gradlew spotlessApply` in your git branch to automatically apply source 
formatting.
   
   When your contribution is ready for review, please @mention somebody who 
can review your changes (perhaps @kennknowles).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150545)
Time Spent: 40m  (was: 0.5h)

> Create more advanced Timeseries processing examples using state API
> ---
>
> Key: BEAM-2953
> URL: https://issues.apache.org/jira/browse/BEAM-2953
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Affects Versions: 2.1.0
>Reporter: Reza ardeshir rokni
>Assignee: Reuven Lax
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As described in the phase 1 portion of this solution outline:
> https://cloud.google.com/solutions/correlating-time-series-dataflow
> BEAM can be used to build out some very interesting pre-processing stages for 
> time series data. Some examples that will be useful:
> - Downsampling time series based on simple AVG, MIN, MAX
> - Creating a value for each time window using generatesequence as a seed 
> - Loading the value of a downsample with the previous value (used in FX with 
> previous close being brought into current open value) 
> This will show some concrete examples of keyed state as well as the use of 
> combiners. 
> The samples can also be used to show how you can create an ordered list of 
> values per key from an unbounded topic which has multiple time series keys. 
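
The pre-processing steps listed in this description can be sketched in plain Python. This is only an illustration of the downsampling and previous-value fill logic under stated assumptions; it does not use the Beam state API, and the function names `downsample` and `fill_gaps` are hypothetical. Timestamps are assumed to be integers and windows are assumed to be fixed-size.

```python
from collections import defaultdict


def downsample(points, window_size):
    """Group (timestamp, value) pairs into fixed windows and compute AVG, MIN, MAX.

    Returns a dict mapping window start -> {"avg", "min", "max"}.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its fixed window.
        window_start = (ts // window_size) * window_size
        buckets[window_start].append(value)
    return {
        start: {"avg": sum(vals) / len(vals), "min": min(vals), "max": max(vals)}
        for start, vals in buckets.items()
    }


def fill_gaps(aggregates, first_window, last_window, window_size):
    """Emit one value per window; empty windows reuse the previous window's value."""
    filled = {}
    previous = None
    # The range plays the role of a generated sequence that seeds every window.
    for start in range(first_window, last_window + 1, window_size):
        if start in aggregates:
            previous = aggregates[start]
        if previous is not None:
            filled[start] = previous
    return filled
```

For example, `downsample([(0, 1.0), (5, 3.0), (25, 10.0)], 10)` produces aggregates for the windows starting at 0 and 20, and `fill_gaps(..., 0, 20, 10)` then carries the window-0 aggregate into the empty window starting at 10. In a streaming pipeline the carried value would have to live in per-key state rather than in a local variable.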



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5519) Spark Streaming Duplicated Encoding/Decoding Effort

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5519?focusedWorklogId=150536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150536
 ]

ASF GitHub Bot logged work on BEAM-5519:


Author: ASF GitHub Bot
Created on: 02/Oct/18 22:00
Start Date: 02/Oct/18 22:00
Worklog Time Spent: 10m 
  Work Description: kyle-winkelman commented on issue #6511: [BEAM-5519] 
Remove call to groupByKey in Spark Streaming.
URL: https://github.com/apache/beam/pull/6511#issuecomment-426445070
 
 
   Any recommendations on how best to thoroughly test this? I have run 
validatesRunnerStreaming and the SparkCoGroupByKeyStreamingTest is functioning 
correctly, but I was wondering what additional steps I should take.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150536)
Time Spent: 1h  (was: 50m)

> Spark Streaming Duplicated Encoding/Decoding Effort
> ---
>
> Key: BEAM-5519
> URL: https://issues.apache.org/jira/browse/BEAM-5519
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Winkelman
>Assignee: Kyle Winkelman
>Priority: Major
>  Labels: spark, spark-streaming
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When using the SparkRunner in streaming mode, there is a call to groupByKey 
> followed by a call to updateStateByKey. BEAM-1815 fixed an issue where this 
> used to cause 2 shuffles, but it still causes 2 encode/decode cycles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch asf-site updated: Publishing website 2018/10/02 22:00:07 at commit ed45509

2018-10-02 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 0f379ff  Publishing website 2018/10/02 22:00:07 at commit ed45509
0f379ff is described below

commit 0f379ff01c09382df4ecc2e3d68d3bd77ca61aa7
Author: jenkins 
AuthorDate: Tue Oct 2 22:00:08 2018 +

Publishing website 2018/10/02 22:00:07 at commit ed45509



Jenkins build is back to normal : beam_PostCommit_Website_Publish #63

2018-10-02 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-3746) Count.globally should override getIncompatibleGlobalWindowErrorMessage to tell the user the usage that is currently only in javadoc

2018-10-02 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-3746:
--

Assignee: (was: Kenneth Knowles)

> Count.globally should override getIncompatibleGlobalWindowErrorMessage to 
> tell the user the usage that is currently only in javadoc
> ---
>
> Key: BEAM-3746
> URL: https://issues.apache.org/jira/browse/BEAM-3746
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: beginner, newbie, starter
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/sdks/javadoc/2.3.0/org/apache/beam/sdk/transforms/Count.html#globally--
> "Note: if the input collection uses a windowing strategy other than 
> GlobalWindows, use Combine.globally(Count.combineFn()).withoutDefaults() 
> instead."
> But the actual crash a user gets is:
> "java.lang.IllegalStateException: Default values are not supported in 
> Combine.globally() if the output PCollection is not windowed by 
> GlobalWindows. Instead, use Combine.globally().withoutDefaults() to output an 
> empty PCollection if the input PCollection is empty, or 
> Combine.globally().asSingletonView() to get the default output of the 
> CombineFn if the input PCollection is empty."
> There is a method that exists solely to make this actually useful, so we 
> should use it!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3746) Count.globally should override getIncompatibleGlobalWindowErrorMessage to tell the user the usage that is currently only in javadoc

2018-10-02 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-3746:
--

Assignee: Kenneth Knowles

> Count.globally should override getIncompatibleGlobalWindowErrorMessage to 
> tell the user the usage that is currently only in javadoc
> ---
>
> Key: BEAM-3746
> URL: https://issues.apache.org/jira/browse/BEAM-3746
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: beginner, newbie, starter
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/sdks/javadoc/2.3.0/org/apache/beam/sdk/transforms/Count.html#globally--
> "Note: if the input collection uses a windowing strategy other than 
> GlobalWindows, use Combine.globally(Count.combineFn()).withoutDefaults() 
> instead."
> But the actual crash a user gets is:
> "java.lang.IllegalStateException: Default values are not supported in 
> Combine.globally() if the output PCollection is not windowed by 
> GlobalWindows. Instead, use Combine.globally().withoutDefaults() to output an 
> empty PCollection if the input PCollection is empty, or 
> Combine.globally().asSingletonView() to get the default output of the 
> CombineFn if the input PCollection is empty."
> There is a method that exists solely to make this actually useful, so we 
> should use it!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3746) Count.globally should override getIncompatibleGlobalWindowErrorMessage to tell the user the usage that is currently only in javadoc

2018-10-02 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-3746:
--

Assignee: Kenneth Knowles  (was: Scott Wegner)

> Count.globally should override getIncompatibleGlobalWindowErrorMessage to 
> tell the user the usage that is currently only in javadoc
> ---
>
> Key: BEAM-3746
> URL: https://issues.apache.org/jira/browse/BEAM-3746
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: beginner, newbie, starter
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/sdks/javadoc/2.3.0/org/apache/beam/sdk/transforms/Count.html#globally--
> "Note: if the input collection uses a windowing strategy other than 
> GlobalWindows, use Combine.globally(Count.combineFn()).withoutDefaults() 
> instead."
> But the actual crash a user gets is:
> "java.lang.IllegalStateException: Default values are not supported in 
> Combine.globally() if the output PCollection is not windowed by 
> GlobalWindows. Instead, use Combine.globally().withoutDefaults() to output an 
> empty PCollection if the input PCollection is empty, or 
> Combine.globally().asSingletonView() to get the default output of the 
> CombineFn if the input PCollection is empty."
> There is a method that exists solely to make this actually useful, so we 
> should use it!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3746) Count.globally should override getIncompatibleGlobalWindowErrorMessage to tell the user the usage that is currently only in javadoc

2018-10-02 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-3746:
--

Assignee: Scott Wegner  (was: Kenneth Knowles)

> Count.globally should override getIncompatibleGlobalWindowErrorMessage to 
> tell the user the usage that is currently only in javadoc
> ---
>
> Key: BEAM-3746
> URL: https://issues.apache.org/jira/browse/BEAM-3746
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beginner, newbie, starter
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/sdks/javadoc/2.3.0/org/apache/beam/sdk/transforms/Count.html#globally--
> "Note: if the input collection uses a windowing strategy other than 
> GlobalWindows, use Combine.globally(Count.combineFn()).withoutDefaults() 
> instead."
> But the actual crash a user gets is:
> "java.lang.IllegalStateException: Default values are not supported in 
> Combine.globally() if the output PCollection is not windowed by 
> GlobalWindows. Instead, use Combine.globally().withoutDefaults() to output an 
> empty PCollection if the input PCollection is empty, or 
> Combine.globally().asSingletonView() to get the default output of the 
> CombineFn if the input PCollection is empty."
> There is a method that exists solely to make this actually useful, so we 
> should use it!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4501) Update contribution guide for new website contribution process

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4501?focusedWorklogId=150535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150535
 ]

ASF GitHub Bot logged work on BEAM-4501:


Author: ASF GitHub Bot
Created on: 02/Oct/18 21:34
Start Date: 02/Oct/18 21:34
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6533: 
[BEAM-4501] Update website contribution and release docs
URL: https://github.com/apache/beam/pull/6533#discussion_r222119885
 
 

 ##
 File path: website/README.md
 ##
 @@ -18,14 +18,26 @@ The Beam website is built using 
[Jekyll](http://jekyllrb.com/). Additionally,
 for additional formatting capabilities, this website uses
 [Twitter Bootstrap](http://getbootstrap.com/).
 
-### Repository Structure
+Documentation generated from source code, such as Javadoc and Pydoc, is stored
+separately on the [beam-site
+repository](https://github.com/apache/beam-site/tree/asf-site/content/documentation/sdks).
 
-This repository contains:
+## Development Workflow with Docker
 
-1. `src/`: the source of the site, including markdown files containing the 
bulk of the content
-1. `content/`: html generated from the markdown (which is what is actually 
hosted on the website)
+### Active development
+
+If you have Docker configured on your machine, the following command may be 
used
+to build and serve the website locally.
+
+$ ./gradlew :beam-website:serveWebsite
+
+Any changes made locally will trigger a rebuild of the website.
 
-## Development Workflow
+You can also run website tests using this command:
+
+$ ./gradlew :beam-website:testWebsite
+
+## Development Workflow without Docker
 
 ### Setup
 
 Review comment:
   Some of it may be of interest for maintaining the docker image itself, but 
should be moved out of the way for regular website contributors.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150535)
Time Spent: 2h 20m  (was: 2h 10m)

> Update contribution guide for new website contribution process
> --
>
> Key: BEAM-4501
> URL: https://issues.apache.org/jira/browse/BEAM-4501
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Udi Meiri
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5105) Move load job poll to finishBundle() method to better parallelize execution

2018-10-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5105?focusedWorklogId=150532&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-150532
 ]

ASF GitHub Bot logged work on BEAM-5105:


Author: ASF GitHub Bot
Created on: 02/Oct/18 21:26
Start Date: 02/Oct/18 21:26
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6416: [BEAM-5105] Better 
parallelize BigQuery load jobs
URL: https://github.com/apache/beam/pull/6416#issuecomment-426436189
 
 
   @aaltay most comments addressed. I just want to write some more unit tests 
before submitting.
   @chamikaramj Do we have any large-scale BQIO tests we can run against actual 
BQ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 150532)
Time Spent: 1h 10m  (was: 1h)

> Move load job poll to finishBundle() method to better parallelize execution
> ---
>
> Key: BEAM-5105
> URL: https://issues.apache.org/jira/browse/BEAM-5105
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> It appears that when we write to BigQuery using WriteTablesDoFn we start a 
> load job and wait for that job to finish.
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java#L318]
>  
> In cases where we are trying to write a PCollection of tables (for example, 
> when users use the dynamic destinations feature) this relies on dynamic work 
> rebalancing to parallelize execution of load jobs. If the runner does not 
> support dynamic work rebalancing or does not execute dynamic work rebalancing 
> for some reason, this could have significant performance drawbacks. For 
> example, scheduling times for load jobs will add up.
>  
> A better approach might be to start load jobs in the process() method but wait 
> for all load jobs to finish in the finishBundle() method. This will parallelize 
> any overheads as well as job execution (assuming more than one job is 
> scheduled by BQ).
>  
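
The approach proposed in this description, starting jobs per element and waiting only at the end of the bundle, can be sketched as follows. This is a minimal illustration in plain Python using a thread pool, not the Beam Java WriteTables implementation. The class `LoadJobBundle` and the `run_load_job` callable are hypothetical stand-ins for a DoFn and for a blocking BigQuery load job.

```python
from concurrent.futures import ThreadPoolExecutor


class LoadJobBundle:
    """Hypothetical stand-in for a DoFn-like object with process / finish_bundle hooks."""

    def __init__(self, run_load_job, max_workers=4):
        self._run_load_job = run_load_job  # callable that blocks until one job is done
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = []

    def process(self, table):
        # Start the job and return immediately instead of waiting for it here.
        self._pending.append(self._executor.submit(self._run_load_job, table))

    def finish_bundle(self):
        # Wait for every job started in this bundle; result() re-raises job failures.
        try:
            return [future.result() for future in self._pending]
        finally:
            self._pending = []

    def teardown(self):
        # Release the worker threads once no more bundles will be processed.
        self._executor.shutdown(wait=True)
```

A bundle would be driven by calling `process(table)` once per table and then `finish_bundle()` once, so the jobs overlap instead of running back to back. Results come back in submission order because the pending futures are kept in a list.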



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

