[jira] [Work logged] (BEAM-5730) Migrate Java test to use a staged worker jar

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5730?focusedWorklogId=155253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155253
 ]

ASF GitHub Bot logged work on BEAM-5730:


Author: ASF GitHub Bot
Created on: 17/Oct/18 03:30
Start Date: 17/Oct/18 03:30
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #6694: [BEAM-5730] Migrate 
ITs using DataflowRunner to use custom worker
URL: https://github.com/apache/beam/pull/6694#issuecomment-430477290
 
 
   All tests passed. Please review this PR @lukecwik 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155253)
Time Spent: 40m  (was: 0.5h)

> Migrate Java test to use a staged worker jar
> 
>
> Key: BEAM-5730
> URL: https://issues.apache.org/jira/browse/BEAM-5730
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4505) Archive/Retire apache/beam-site repository

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner resolved BEAM-4505.

   Resolution: Fixed
Fix Version/s: Not applicable

> Archive/Retire apache/beam-site repository
> --
>
> Key: BEAM-4505
> URL: https://issues.apache.org/jira/browse/BEAM-4505
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Closed] (BEAM-4493) Beam-Site Automation Reliability

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-4493.
--
   Resolution: Fixed
Fix Version/s: Not applicable

This migration is now complete!

> Beam-Site Automation Reliability
> 
>
> Key: BEAM-4493
> URL: https://issues.apache.org/jira/browse/BEAM-4493
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
> Fix For: Not applicable
>
>
> https://s.apache.org/beam-site-automation





[jira] [Closed] (BEAM-4504) Disconnect mergebot from apache/beam-site repository

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-4504.
--
   Resolution: Fixed
Fix Version/s: Not applicable

>  Disconnect mergebot from apache/beam-site repository
> -
>
> Key: BEAM-4504
> URL: https://issues.apache.org/jira/browse/BEAM-4504
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4504) Disconnect mergebot from apache/beam-site repository

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4504?focusedWorklogId=155245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155245
 ]

ASF GitHub Bot logged work on BEAM-4504:


Author: ASF GitHub Bot
Created on: 17/Oct/18 02:16
Start Date: 17/Oct/18 02:16
Worklog Time Spent: 10m 
  Work Description: swegner closed pull request #6713: [BEAM-4504] Retire 
Jenkins jobs from apache/beam-site repository
URL: https://github.com/apache/beam/pull/6713
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/CommonJobProperties.groovy 
b/.test-infra/jenkins/CommonJobProperties.groovy
index d098e1a8c7b..641cdfbd051 100644
--- a/.test-infra/jenkins/CommonJobProperties.groovy
+++ b/.test-infra/jenkins/CommonJobProperties.groovy
@@ -24,66 +24,15 @@ class CommonJobProperties {
 
   static String checkoutDir = 'src'
 
-  static void setSCM(def context, String repositoryName, boolean 
allowRemotePoll = true) {
-context.scm {
-  git {
-remote {
-  // Double quotes here mean ${repositoryName} is interpolated.
-  github("apache/${repositoryName}")
-  // Single quotes here mean that ${ghprbPullId} is not interpolated 
and instead passed
-  // through to Jenkins where it refers to the environment variable.
-  refspec('+refs/heads/*:refs/remotes/origin/* ' +
-  
'+refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*')
-}
-branch('${sha1}')
-extensions {
-  cleanAfterCheckout()
-  relativeTargetDirectory(checkoutDir)
-  if (!allowRemotePoll) {
-disableRemotePoll()
-  }
-}
-  }
-}
-  }
-
-  // Sets common top-level job properties for website repository jobs.
-  static void setTopLevelWebsiteJobProperties(def context,
-  String branch = 'asf-site',
-  int timeout = 100) {
-setTopLevelJobProperties(
-context,
-'beam-site',
-branch,
-timeout)
-  }
-
   // Sets common top-level job properties for main repository jobs.
   static void setTopLevelMainJobProperties(def context,
-   String branch = 'master',
-   int timeout = 100,
+   String defaultBranch = 'master',
+   int defaultTimeout = 100,
boolean allowRemotePoll = true,
String jenkinsExecutorLabel =  
'beam') {
-setTopLevelJobProperties(
-context,
-'beam',
-branch,
-timeout,
-allowRemotePoll,
-jenkinsExecutorLabel)
-  }
-
-  // Sets common top-level job properties. Accessed through one of the above
-  // methods to protect jobs from internal details of param defaults.
-  private static void setTopLevelJobProperties(def context,
-   String repositoryName,
-   String defaultBranch,
-   int defaultTimeout,
-   boolean allowRemotePoll = true,
-   String jenkinsExecutorLabel = 
'beam') {
 // GitHub project.
 context.properties {
-  githubProjectUrl('https://github.com/apache/' + repositoryName + '/')
+  githubProjectUrl('https://github.com/apache/beam/')
 }
 
 // Set JDK version.
@@ -98,7 +47,25 @@ class CommonJobProperties {
 }
 
 // Source code management.
-setSCM(context, repositoryName, allowRemotePoll)
+context.scm {
+  git {
+remote {
+  github("apache/beam")
+  // Single quotes here mean that ${ghprbPullId} is not interpolated 
and instead passed
+  // through to Jenkins where it refers to the environment variable.
+  refspec('+refs/heads/*:refs/remotes/origin/* ' +
+  
'+refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*')
+}
+branch('${sha1}')
+extensions {
+  cleanAfterCheckout()
+  relativeTargetDirectory(checkoutDir)
+  if (!allowRemotePoll) {
+disableRemotePoll()
+  }
+}
+  }
+}
 
 context.parameters {
   // This is a recommended setup if you want to run the job manually. The
@@ -196,14 +163,6 @@ class CommonJobProperties {
 context.switches("-Dorg.gradle.j

[jira] [Work logged] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4130?focusedWorklogId=155238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155238
 ]

ASF GitHub Bot logged work on BEAM-4130:


Author: ASF GitHub Bot
Created on: 17/Oct/18 01:14
Start Date: 17/Oct/18 01:14
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6703: [BEAM-4130] Add tests 
for FlinkJobServerDriver
URL: https://github.com/apache/beam/pull/6703
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
index 34f2edb5abb..93dc6f0121c 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
@@ -17,6 +17,7 @@
  */
 package org.apache.beam.runners.flink;
 
+import com.google.common.annotations.VisibleForTesting;
 import com.google.common.util.concurrent.ListeningExecutorService;
 import com.google.common.util.concurrent.MoreExecutors;
 import com.google.common.util.concurrent.ThreadFactoryBuilder;
@@ -45,7 +46,7 @@
   private static final Logger LOG = 
LoggerFactory.getLogger(FlinkJobServerDriver.class);
 
   private final ListeningExecutorService executor;
-  private final ServerConfiguration configuration;
+  @VisibleForTesting ServerConfiguration configuration;
   private final ServerFactory jobServerFactory;
   private final ServerFactory artifactServerFactory;
   private GrpcFnServer jobServer;
@@ -54,34 +55,34 @@
   /** Configuration for the jobServer. */
   public static class ServerConfiguration {
 @Option(name = "--job-host", usage = "The job server host name")
-private String host = "";
+String host = "localhost";
 
 @Option(
   name = "--job-port",
   usage = "The job service port. 0 to use a dynamic port. (Default: 8099)"
 )
-private int port = 8099;
+int port = 8099;
 
 @Option(
   name = "--artifact-port",
   usage = "The artifact service port. 0 to use a dynamic port. (Default: 
8098)"
 )
-private int artifactPort = 8098;
+int artifactPort = 8098;
 
 @Option(name = "--artifacts-dir", usage = "The location to store staged 
artifact files")
-private String artifactStagingPath =
+String artifactStagingPath =
 Paths.get(System.getProperty("java.io.tmpdir"), 
"beam-artifact-staging").toString();
 
 @Option(
   name = "--clean-artifacts-per-job",
   usage = "When true, remove each job's staged artifacts when it completes"
 )
-private Boolean cleanArtifactsPerJob = false;
+boolean cleanArtifactsPerJob = false;
 
 @Option(name = "--flink-master-url", usage = "Flink master url to submit 
job.")
-private String flinkMasterUrl = "[auto]";
+String flinkMasterUrl = "[auto]";
 
-public String getFlinkMasterUrl() {
+String getFlinkMasterUrl() {
   return this.flinkMasterUrl;
 }
 
@@ -89,9 +90,9 @@ public String getFlinkMasterUrl() {
   name = "--sdk-worker-parallelism",
   usage = "Default parallelism for SDK worker processes (see portable 
pipeline options)"
 )
-private String sdkWorkerParallelism = 
PortablePipelineOptions.SDK_WORKER_PARALLELISM_PIPELINE;
+String sdkWorkerParallelism = 
PortablePipelineOptions.SDK_WORKER_PARALLELISM_PIPELINE;
 
-public String getSdkWorkerParallelism() {
+String getSdkWorkerParallelism() {
   return this.sdkWorkerParallelism;
 }
   }
@@ -209,7 +210,7 @@ public void stop() {
   .build();
   jobServiceGrpcFnServer = GrpcFnServer.create(service, descriptor, 
jobServerFactory);
 }
-LOG.info("JobServer started on {}", 
jobServiceGrpcFnServer.getApiServiceDescriptor().getUrl());
+LOG.info("JobService started on {}", 
jobServiceGrpcFnServer.getApiServiceDescriptor().getUrl());
 return jobServiceGrpcFnServer;
   }
 
diff --git 
a/runners/flink/src/test/java/org/apache/beam/runners/flink/FlinkJobServerDriverTest.java
 
b/runners/flink/src/test/java/org/apache/beam/runners/flink/FlinkJobServerDriverTest.java
new file mode 100644
index 000..fc44d8edf31
--- /dev/null
+++ 
b/runners/flink/src/test/java/org/apache/beam/runners/flink/FlinkJobServerDriverTest.java
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache L

[jira] [Assigned] (BEAM-5760) Portable Flink support for maxBundleSize/maxBundleMillis

2018-10-16 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise reassigned BEAM-5760:
--

Assignee: Thomas Weise

> Portable Flink support for maxBundleSize/maxBundleMillis
> 
>
> Key: BEAM-5760
> URL: https://issues.apache.org/jira/browse/BEAM-5760
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.8.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability-flink
> Fix For: 2.9.0
>
>
> The portable runner needs to support larger bundles in streaming mode. 
> Currently every element is a separate bundle, which is very inefficient due 
> to the per-bundle SDK worker overhead. The old Java SDK runner already 
> supports these parameters.
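
To make the idea of size- and time-bounded bundles concrete, here is a minimal sketch in Python. It is not the Flink runner's implementation: the function name `bundle_elements` and its `clock` parameter are hypothetical, and only the two limits are taken from the issue title.

```python
import time


def bundle_elements(elements, max_bundle_size, max_bundle_millis,
                    clock=time.monotonic):
    """Group elements into bundles closed by a size limit or an age limit.

    Hypothetical helper for illustration only. A real streaming runner
    would also need a timer to flush an idle bundle; this sketch only
    checks the age when a new element arrives.
    """
    bundle = []
    started = None
    for element in elements:
        if not bundle:
            started = clock()  # the bundle's age starts at its first element
        bundle.append(element)
        age_millis = (clock() - started) * 1000.0
        if len(bundle) >= max_bundle_size or age_millis >= max_bundle_millis:
            yield bundle
            bundle = []
    if bundle:
        yield bundle  # flush whatever is left at end of input
```

With `max_bundle_size=1` this degenerates to one element per bundle, which is the behavior the issue describes as inefficient.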





[jira] [Work logged] (BEAM-5775) Make the spark runner not serialize data unless spark is spilling to disk

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5775?focusedWorklogId=155233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155233
 ]

ASF GitHub Bot logged work on BEAM-5775:


Author: ASF GitHub Bot
Created on: 17/Oct/18 00:51
Start Date: 17/Oct/18 00:51
Worklog Time Spent: 10m 
  Work Description: mikekap opened a new pull request #6714: [BEAM-5775] 
Spark: implement a custom class to lazily encode values for persistence.
URL: https://github.com/apache/beam/pull/6714
 
 
   Spark's `StorageLevel` is the preferred mechanism for deciding what is 
serialized, when, and where. With this change, Beam respects Spark's wish to keep 
data deserialized in memory, even if the storage level *may* spill to disk (e.g. 
`MEMORY_AND_DISK`).
   
   This PR also includes a drive-by fix for the `MEMORY_ONLY_2` storage level. The 
code previously assumed that no serialization was necessary, which isn't 
strictly true, since the `_2` suffix means "replicate to other nodes", i.e. 
serialize over the network.
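
   As a rough illustration of the distinction drawn above, the following Python sketch classifies Spark storage level names by when values would need to be serialized. It is a name-based heuristic written for this note, not Beam or Spark code: the function `encoding_need` and its three labels are hypothetical, and the treatment of `DISK_ONLY` and `OFF_HEAP` is an assumption based on Spark storing those levels in serialized form.

```python
def encoding_need(storage_level):
    """Return "never", "on_demand", or "always" for a storage level name.

    Hypothetical helper: it inspects only the level's name.
    """
    name = storage_level.upper()
    replicated = name.endswith("_2")          # "_2" means replicate to another node
    base = name[:-2] if replicated else name
    if base.endswith("_SER") or base in ("DISK_ONLY", "OFF_HEAP"):
        return "always"                       # data is stored in serialized form
    if replicated or "DISK" in base:
        return "on_demand"                    # only when replicated or spilled
    return "never"                            # plain MEMORY_ONLY stays deserialized
```

   Under this reading, `MEMORY_AND_DISK` and `MEMORY_ONLY_2` both fall into the "on_demand" group, which is the case where encoding lazily avoids unnecessary work.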
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/)
 | --- | --- | ---
   
   
   
   
   



[jira] [Work logged] (BEAM-5058) Python precommits should run E2E tests

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?focusedWorklogId=155230&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155230
 ]

ASF GitHub Bot logged work on BEAM-5058:


Author: ASF GitHub Bot
Created on: 17/Oct/18 00:48
Start Date: 17/Oct/18 00:48
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #6707: 
[BEAM-5058] Run basic ITs in Python Precommit
URL: https://github.com/apache/beam/pull/6707#discussion_r225749774
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -226,6 +228,26 @@ task directRunnerIT(dependsOn: 'installGcpTest') {
   }
 }
 
+task precommitIT(dependsOn: ['installGcpTest', 'sdist']) {
 
 Review comment:
   This task should run in parallel to the rest of the precommit tasks. This 
can be done if it is placed in a separate sub-project. Sub-projects are created 
by creating a new build.gradle file in a subdirectory, such as 
`sdks/python/precommit/dataflow/build.gradle`. (example of making tests 
parallel: https://github.com/apache/beam/pull/5731/files)




Issue Time Tracking
---

Worklog Id: (was: 155230)
Time Spent: 0.5h  (was: 20m)

> Python precommits should run E2E tests
> --
>
> Key: BEAM-5058
> URL: https://issues.apache.org/jira/browse/BEAM-5058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> According to [https://beam.apache.org/contribute/testing/] (which I'm working 
> on), end-to-end tests should be run in precommit on each combination of 
> \{batch, streaming}x\{SDK language}x\{supported runner}.
> At least 2 tests need to be added to Python's precommit: wordcount and 
> wordcount_streaming on Dataflow, and possibly on other supported runners 
> (direct runner and new runners plz).
>  These tests should be configured to run from a Gradle sub-project, so that 
> they're run in parallel to the unit tests.
> Example that parallelizes Java precommit integration tests: 
> [https://github.com/apache/beam/pull/5731]





[jira] [Work logged] (BEAM-5058) Python precommits should run E2E tests

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?focusedWorklogId=155231&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155231
 ]

ASF GitHub Bot logged work on BEAM-5058:


Author: ASF GitHub Bot
Created on: 17/Oct/18 00:48
Start Date: 17/Oct/18 00:48
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #6707: 
[BEAM-5058] Run basic ITs in Python Precommit
URL: https://github.com/apache/beam/pull/6707#discussion_r225750474
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -226,6 +228,26 @@ task directRunnerIT(dependsOn: 'installGcpTest') {
   }
 }
 
+task precommitIT(dependsOn: ['installGcpTest', 'sdist']) {
+  doLast {
+// List of integration tests running in Python PreCommit.
+def precommitTests = [
+"apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it",
+
"apache_beam.examples.streaming_wordcount_it_test:StreamingWordCountIT.test_streaming_wordcount_it",
+]
+def testOpts = [
 
 Review comment:
   No need for `--attr=IT`, `--nologcapture`, `--nocapture`?




Issue Time Tracking
---

Worklog Id: (was: 155231)
Time Spent: 0.5h  (was: 20m)

> Python precommits should run E2E tests
> --
>
> Key: BEAM-5058
> URL: https://issues.apache.org/jira/browse/BEAM-5058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> According to [https://beam.apache.org/contribute/testing/] (which I'm working 
> on), end-to-end tests should be run in precommit on each combination of 
> \{batch, streaming}x\{SDK language}x\{supported runner}.
> At least 2 tests need to be added to Python's precommit: wordcount and 
> wordcount_streaming on Dataflow, and possibly on other supported runners 
> (direct runner and new runners plz).
>  These tests should be configured to run from a Gradle sub-project, so that 
> they're run in parallel to the unit tests.
> Example that parallelizes Java precommit integration tests: 
> [https://github.com/apache/beam/pull/5731]





[jira] [Work logged] (BEAM-5058) Python precommits should run E2E tests

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?focusedWorklogId=155229&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155229
 ]

ASF GitHub Bot logged work on BEAM-5058:


Author: ASF GitHub Bot
Created on: 17/Oct/18 00:48
Start Date: 17/Oct/18 00:48
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #6707: 
[BEAM-5058] Run basic ITs in Python Precommit
URL: https://github.com/apache/beam/pull/6707#discussion_r225751687
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -226,6 +228,26 @@ task directRunnerIT(dependsOn: 'installGcpTest') {
   }
 }
 
+task precommitIT(dependsOn: ['installGcpTest', 'sdist']) {
+  doLast {
+// List of integration tests running in Python PreCommit.
+def precommitTests = [
+"apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it",
+
"apache_beam.examples.streaming_wordcount_it_test:StreamingWordCountIT.test_streaming_wordcount_it",
+]
+def testOpts = [
+"--tests=${precommitTests.join(',')}",
+"--processes=4",
+"--process-timeout=1800",   // Total timeout includes all tests run.
+]
+
+exec {
+  executable 'sh'
+  args '-c', ". ${envdir}/bin/activate && 
./scripts/run_integration_test.sh --test_opts \"${testOpts.join(' ')}\""
 
 Review comment:
   After parallelizing this task, please add a copy of it in 
`sdks/python/precommit/directrunner/build.gradle` with the option 
`--runner=TestDirectRunner`.




Issue Time Tracking
---

Worklog Id: (was: 155229)
Time Spent: 0.5h  (was: 20m)

> Python precommits should run E2E tests
> --
>
> Key: BEAM-5058
> URL: https://issues.apache.org/jira/browse/BEAM-5058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> According to [https://beam.apache.org/contribute/testing/] (which I'm working 
> on), end-to-end tests should be run in precommit on each combination of 
> \{batch, streaming}x\{SDK language}x\{supported runner}.
> At least 2 tests need to be added to Python's precommit: wordcount and 
> wordcount_streaming on Dataflow, and possibly on other supported runners 
> (direct runner and new runners plz).
>  These tests should be configured to run from a Gradle sub-project, so that 
> they're run in parallel to the unit tests.
> Example that parallelizes Java precommit integration tests: 
> [https://github.com/apache/beam/pull/5731]





[jira] [Work logged] (BEAM-5058) Python precommits should run E2E tests

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?focusedWorklogId=155232&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155232
 ]

ASF GitHub Bot logged work on BEAM-5058:


Author: ASF GitHub Bot
Created on: 17/Oct/18 00:48
Start Date: 17/Oct/18 00:48
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #6707: 
[BEAM-5058] Run basic ITs in Python Precommit
URL: https://github.com/apache/beam/pull/6707#discussion_r225750634
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -226,6 +228,26 @@ task directRunnerIT(dependsOn: 'installGcpTest') {
   }
 }
 
+task precommitIT(dependsOn: ['installGcpTest', 'sdist']) {
 
 Review comment:
   `sdist` should run before `installGcpTest`. I believe Gradle has a rule for that.




Issue Time Tracking
---

Worklog Id: (was: 155232)
Time Spent: 40m  (was: 0.5h)

> Python precommits should run E2E tests
> --
>
> Key: BEAM-5058
> URL: https://issues.apache.org/jira/browse/BEAM-5058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> According to [https://beam.apache.org/contribute/testing/] (which I'm working 
> on), end-to-end tests should be run in precommit on each combination of 
> \{batch, streaming}x\{SDK language}x\{supported runner}.
> At least 2 tests need to be added to Python's precommit: wordcount and 
> wordcount_streaming on Dataflow, and possibly on other supported runners 
> (direct runner and new runners plz).
>  These tests should be configured to run from a Gradle sub-project, so that 
> they're run in parallel to the unit tests.
> Example that parallelizes Java precommit integration tests: 
> [https://github.com/apache/beam/pull/5731]





[jira] [Work logged] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4130?focusedWorklogId=155228&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155228
 ]

ASF GitHub Bot logged work on BEAM-4130:


Author: ASF GitHub Bot
Created on: 17/Oct/18 00:32
Start Date: 17/Oct/18 00:32
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6703: [BEAM-4130] Add tests 
for FlinkJobServerDriver
URL: https://github.com/apache/beam/pull/6703#issuecomment-430448030
 
 
   Run Java PreCommit




Issue Time Tracking
---

Worklog Id: (was: 155228)
Time Spent: 14h 20m  (was: 14h 10m)

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 2.7.0
>
>  Time Spent: 14h 20m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.





[jira] [Commented] (BEAM-5663) Add tox suites for various Python 3 versions

2018-10-16 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652724#comment-16652724
 ] 

Valentyn Tymofieiev commented on BEAM-5663:
---

I may be wrong, but I suspect that for some reason the `@unittest.skipIf` 
decorator did not get triggered in your test suite, and then the suite ran into 
BEAM-5623, which takes a long time to finish. The Travis logs are truncated, so 
I could not see whether we ran those tests or not.

I tried to run the py3 tox test suite from a python:3.4-stretch container (see: 
https://s.apache.org/beam-py3-conversion-quick-start) and 117 tests didn't 
pass. The test suite finished within a few minutes.
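
For reference, a conditional skip in the standard `unittest` module looks like the minimal sketch below. The test class, the skip reason, and the `run_example` helper are hypothetical and are not taken from the Beam test suite. Checking the `skipped` list on the result is one way to confirm whether the decorator actually took effect.

```python
import sys
import unittest


class ExampleTest(unittest.TestCase):
    # Hypothetical test: skipped whenever the interpreter is Python 3.
    @unittest.skipIf(sys.version_info[0] == 3,
                     "hypothetical: not yet ported to Python 3")
    def test_legacy_behaviour(self):
        self.fail("only reached when the skip condition is false")

    def test_always_runs(self):
        self.assertEqual(1 + 1, 2)


def run_example():
    """Run ExampleTest and return the TestResult for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

If the condition is evaluated against the wrong interpreter or the decorator is missing, the test body runs instead of appearing in `result.skipped`.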

> Add tox suites for various Python 3 versions
> 
>
> Key: BEAM-5663
> URL: https://issues.apache.org/jira/browse/BEAM-5663
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Manu Zhang
>Priority: Minor
>
> Currently, Python 3.5.2 is set up for Jenkins tests, but we've seen test 
> failures across various Python 3 versions. It would be valuable to add tox 
> suites for Python 3.4, 3.5, 3.6 and 3.7.





[jira] [Created] (BEAM-5775) Make the spark runner not serialize data unless spark is spilling to disk

2018-10-16 Thread Mike Kaplinskiy (JIRA)
Mike Kaplinskiy created BEAM-5775:
-

 Summary: Make the spark runner not serialize data unless spark is 
spilling to disk
 Key: BEAM-5775
 URL: https://issues.apache.org/jira/browse/BEAM-5775
 Project: Beam
  Issue Type: Improvement
  Components: runner-spark
Reporter: Mike Kaplinskiy
Assignee: Amit Sela


Currently, for storage level MEMORY_ONLY, Beam does not coder-ify the data. This 
lets Spark keep the data in memory, avoiding the serialization round trip. 
Unfortunately the logic is fairly coarse - as soon as you switch to 
MEMORY_AND_DISK, Beam coder-ifies the data even though Spark might have chosen 
to keep the data in memory, incurring the serialization overhead.

 

Ideally, Beam would serialize the data lazily - only when Spark chooses to spill 
to disk. This would be a change in behavior when using Beam, but luckily Spark 
has a solution for folks that want data serialized in memory: 
MEMORY_AND_DISK_SER will keep the data serialized.
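
The idea can be sketched in plain Python. This is only a toy model under stated assumptions: the class name `LazySpillCache`, the entry-count memory budget, and the use of `pickle` in place of a Beam coder are all hypothetical, and none of it is Beam or Spark code. Values stay as live objects while they fit in memory and are serialized only at the moment they are spilled.

```python
import pickle


class LazySpillCache:
    """Toy cache: keeps live objects and pickles them only on spill.

    Hypothetical illustration only; this is neither Beam nor Spark code.
    """

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.in_memory = {}       # key -> live object (no serialization cost)
        self.spilled = {}         # key -> bytes (stands in for blocks on disk)
        self.serialize_calls = 0  # counts how often we paid for serialization

    def put(self, key, value):
        if key in self.in_memory or len(self.in_memory) < self.max_in_memory:
            # Fits in memory: store the object as-is.
            self.spilled.pop(key, None)
            self.in_memory[key] = value
        else:
            # Memory budget exhausted: serialize lazily, at spill time.
            self.spilled[key] = pickle.dumps(value)
            self.serialize_calls += 1

    def get(self, key):
        if key in self.in_memory:
            return self.in_memory[key]
        return pickle.loads(self.spilled[key])
```

With a budget of two entries, storing three values costs a single serialization, whereas eagerly encoding every value (the MEMORY_AND_DISK behavior described above) would cost three.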





[jira] [Commented] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-16 Thread Melissa Pashniak (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652719#comment-16652719
 ] 

Melissa Pashniak commented on BEAM-5741:


Another possible option is removing or combining existing nav item(s)?  But I'm 
not sure which we'd want to remove/combine as they all seem useful.

 

> Move "Contact Us" to a top-level link
> -
>
> Key: BEAM-5741
> URL: https://issues.apache.org/jira/browse/BEAM-5741
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> It should be very easy to figure out how to get in touch with community. 
> "Contact Us" should be a top-level link on the page.
> The page can also be improved with:
> * Some basic text on how to use subscribe / unsubscribe links
> * Recommendations on how to use various communications channels (Slack for 
> quick questions, dev@ for longer conversations. And all decisions should make 
> it back to dev@)





[jira] [Comment Edited] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-16 Thread Melissa Pashniak (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652712#comment-16652712
 ] 

Melissa Pashniak edited comment on BEAM-5741 at 10/17/18 12:15 AM:
---

Would this be an addition section, or replace the Community section? I don't 
really agree with adding yet another item, there are two aspects here:

1) We have done a lot of rearranging of the navigation on the site, and as part 
of this, we looked at many sites when we landed on our current breakdown. On a 
majority of the sites that use a top nav structure, the contact us/mailing list 
pages are in a nav item "Community" (for example many Apache sites - Spark, 
Apex, Hadoop, Gearpump, and other big projects such as Tensorflow, Kubernetes, 
etc.)  Because of this, we used the same "Community" terminology for 
consistency, and made the contact us page the default page that shows up when 
someone chooses Community. We used to have pull-down menus on the top nav, but 
we received feedback that it caused trouble for mobile devices because the 
menus were too long. We could attempt to put that back and only have a small 
subset of options, though it might be confusing to show some things but not all 
unless they click on the item. We could also move to a permanent static left 
nav structure to show more items at once (such as Flink, which has a "Getting 
help" page that is always visible), but then we'd lose the section-specific 
left nav when you choose a top nav item.

2) The other issue is one of remaining horizontal space. I am looking into 
adding search capability (a search bar) for the site, which would take up a 
big chunk of the remaining space after the top nav items. We are already 
nearing (imo) too many top-level nav items. (Some actually weren't enthused 
with how many there are even now.)

 


was (Author: melap):
Would this replace the Community section? I don't really agree with this, there 
are two aspects here:

1) We have done a lot of rearranging of the navigation on the site, and as part 
of this, we looked at many sites when we landed on our current breakdown. On a 
majority of the sites that use a top nav structure, the contact us/mailing list 
pages are in a nav item "Community" (for example many Apache sites - Spark, 
Apex, Hadoop, Gearpump, and other big projects such as Tensorflow, Kubernetes, 
etc.)  Because of this, we used the same "Community" terminology for 
consistency, and made the contact us page the default page that shows up when 
someone chooses Community. We used to have pull-down menus on the top nav, but 
we received feedback that it caused trouble for mobile devices because the 
menus were too long. We could attempt to put that back and only have a small 
subset of options, though it might be confusing to show some things but not all 
unless they click on the item. We could also move to a permanent static left 
nav structure to show more items at once (such as Flink, which has a "Getting 
help" page that is always visible), but then we'd lose the section-specific 
left nav when you choose a top nav item.

2) The other issue is one of remaining horizontal space. I am looking into 
adding search capability (a search bar) for the site, which would take up a 
big chunk of the remaining space after the top nav items. We are already 
nearing (imo) too many top-level nav items. (Some actually weren't enthused 
with how many there are even now.)

 

> Move "Contact Us" to a top-level link
> -
>
> Key: BEAM-5741
> URL: https://issues.apache.org/jira/browse/BEAM-5741
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> It should be very easy to figure out how to get in touch with community. 
> "Contact Us" should be a top-level link on the page.
> The page can also be improved with:
> * Some basic text on how to use subscribe / unsubscribe links
> * Recommendations on how to use various communications channels (Slack for 
> quick questions, dev@ for longer conversations. And all decisions should make 
> it back to dev@)





[jira] [Comment Edited] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-16 Thread Melissa Pashniak (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652712#comment-16652712
 ] 

Melissa Pashniak edited comment on BEAM-5741 at 10/17/18 12:15 AM:
---

Would this be an additional section, or replace the Community section? I don't 
really agree with adding yet another item, there are two aspects here:

1) We have done a lot of rearranging of the navigation on the site, and as part 
of this, we looked at many sites when we landed on our current breakdown. On a 
majority of the sites that use a top nav structure, the contact us/mailing list 
pages are in a nav item "Community" (for example many Apache sites - Spark, 
Apex, Hadoop, Gearpump, and other big projects such as Tensorflow, Kubernetes, 
etc.)  Because of this, we used the same "Community" terminology for 
consistency, and made the contact us page the default page that shows up when 
someone chooses Community. We used to have pull-down menus on the top nav, but 
we received feedback that it caused trouble for mobile devices because the 
menus were too long. We could attempt to put that back and only have a small 
subset of options, though it might be confusing to show some things but not all 
unless they click on the item. We could also move to a permanent static left 
nav structure to show more items at once (such as Flink, which has a "Getting 
help" page that is always visible), but then we'd lose the section-specific 
left nav when you choose a top nav item.

2) The other issue is one of remaining horizontal space. I am looking into 
adding search capability (a search bar) for the site, which would take up a 
big chunk of the remaining space after the top nav items. We are already 
nearing (imo) too many top-level nav items. (Some actually weren't enthused 
with how many there are even now.)

 


was (Author: melap):
Would this be an addition section, or replace the Community section? I don't 
really agree with adding yet another item, there are two aspects here:

1) We have done a lot of rearranging of the navigation on the site, and as part 
of this, we looked at many sites when we landed on our current breakdown. On a 
majority of the sites that use a top nav structure, the contact us/mailing list 
pages are in a nav item "Community" (for example many Apache sites - Spark, 
Apex, Hadoop, Gearpump, and other big projects such as Tensorflow, Kubernetes, 
etc.)  Because of this, we used the same "Community" terminology for 
consistency, and made the contact us page the default page that shows up when 
someone chooses Community. We used to have pull-down menus on the top nav, but 
we received feedback that it caused trouble for mobile devices because the 
menus were too long. We could attempt to put that back and only have a small 
subset of options, though it might be confusing to show some things but not all 
unless they click on the item. We could also move to a permanent static left 
nav structure to show more items at once (such as Flink, which has a "Getting 
help" page that is always visible), but then we'd lose the section-specific 
left nav when you choose a top nav item.

2) The other issue is one of remaining horizontal space. I am looking into 
adding search capability (a search bar) for the site, which would take up a 
big chunk of the remaining space after the top nav items. We are already 
nearing (imo) too many top-level nav items. (Some actually weren't enthused 
with how many there are even now.)

 

> Move "Contact Us" to a top-level link
> -
>
> Key: BEAM-5741
> URL: https://issues.apache.org/jira/browse/BEAM-5741
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> It should be very easy to figure out how to get in touch with community. 
> "Contact Us" should be a top-level link on the page.
> The page can also be improved with:
> * Some basic text on how to use subscribe / unsubscribe links
> * Recommendations on how to use various communications channels (Slack for 
> quick questions, dev@ for longer conversations. And all decisions should make 
> it back to dev@)





[jira] [Commented] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-16 Thread Melissa Pashniak (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652712#comment-16652712
 ] 

Melissa Pashniak commented on BEAM-5741:


Would this replace the Community section? I don't really agree with this, there 
are two aspects here:

1) We have done a lot of rearranging of the navigation on the site, and as part 
of this, we looked at many sites when we landed on our current breakdown. On a 
majority of the sites that use a top nav structure, the contact us/mailing list 
pages are in a nav item "Community" (for example many Apache sites - Spark, 
Apex, Hadoop, Gearpump, and other big projects such as Tensorflow, Kubernetes, 
etc.)  Because of this, we used the same "Community" terminology for 
consistency, and made the contact us page the default page that shows up when 
someone chooses Community. We used to have pull-down menus on the top nav, but 
we received feedback that it caused trouble for mobile devices because the 
menus were too long. We could attempt to put that back and only have a small 
subset of options, though it might be confusing to show some things but not all 
unless they click on the item. We could also move to a permanent static left 
nav structure to show more items at once (such as Flink, which has a "Getting 
help" page that is always visible), but then we'd lose the section-specific 
left nav when you choose a top nav item.

2) The other issue is one of remaining horizontal space. I am looking into 
adding search capability (a search bar) for the site, which would take up a 
big chunk of the remaining space after the top nav items. We are already 
nearing (imo) too many top-level nav items. (Some actually weren't enthused 
with how many there are even now.)

 

> Move "Contact Us" to a top-level link
> -
>
> Key: BEAM-5741
> URL: https://issues.apache.org/jira/browse/BEAM-5741
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> It should be very easy to figure out how to get in touch with community. 
> "Contact Us" should be a top-level link on the page.
> The page can also be improved with:
> * Some basic text on how to use subscribe / unsubscribe links
> * Recommendations on how to use various communications channels (Slack for 
> quick questions, dev@ for longer conversations. And all decisions should make 
> it back to dev@)





[jira] [Work logged] (BEAM-4504) Disconnect mergebot from apache/beam-site repository

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4504?focusedWorklogId=155221&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155221
 ]

ASF GitHub Bot logged work on BEAM-4504:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:49
Start Date: 16/Oct/18 23:49
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #6713: [BEAM-4504] Retire 
Jenkins jobs from apache/beam-site repository
URL: https://github.com/apache/beam/pull/6713#issuecomment-430440123
 
 
   Seed job [succeeded](https://builds.apache.org/job/beam_SeedJob/2809/).
   
   R: @Ardagan 




Issue Time Tracking
---

Worklog Id: (was: 155221)
Time Spent: 0.5h  (was: 20m)

>  Disconnect mergebot from apache/beam-site repository
> -
>
> Key: BEAM-4504
> URL: https://issues.apache.org/jira/browse/BEAM-4504
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-16 Thread Scott Wegner (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652674#comment-16652674
 ] 

Scott Wegner commented on BEAM-5741:


Yes, the feedback we got was that one of the most important pieces of 
documentation is how to reach out to the community. The 'contact-us' page is 
pretty good, but finding it is a bit difficult. It would be useful as a 
top-level link.

/cc [~rohdesam]

> Move "Contact Us" to a top-level link
> -
>
> Key: BEAM-5741
> URL: https://issues.apache.org/jira/browse/BEAM-5741
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> It should be very easy to figure out how to get in touch with community. 
> "Contact Us" should be a top-level link on the page.
> The page can also be improved with:
> * Some basic text on how to use subscribe / unsubscribe links
> * Recommendations on how to use various communications channels (Slack for 
> quick questions, dev@ for longer conversations. And all decisions should make 
> it back to dev@)





[jira] [Work logged] (BEAM-4504) Disconnect mergebot from apache/beam-site repository

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4504?focusedWorklogId=155219&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155219
 ]

ASF GitHub Bot logged work on BEAM-4504:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:45
Start Date: 16/Oct/18 23:45
Worklog Time Spent: 10m 
  Work Description: swegner opened a new pull request #6713: [BEAM-4504] 
Retire Jenkins jobs from apache/beam-site repository
URL: https://github.com/apache/beam/pull/6713
 
 
   Website sources have been moved to apache/beam repository. This
   cleans up the Jenkins job definitions and removes some common code that
   was only used by those jobs.
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 155219)
Time Spent: 10m
Remaining Estimate: 0h

>  Disconnect mergebot from apache/beam-site repository
> -
>
> Key: BEAM-4504
> URL: https://issues.apache.org/jira/

[jira] [Work logged] (BEAM-4504) Disconnect mergebot from apache/beam-site repository

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4504?focusedWorklogId=155220&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155220
 ]

ASF GitHub Bot logged work on BEAM-4504:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:45
Start Date: 16/Oct/18 23:45
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #6713: [BEAM-4504] Retire 
Jenkins jobs from apache/beam-site repository
URL: https://github.com/apache/beam/pull/6713#issuecomment-430439411
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 155220)
Time Spent: 20m  (was: 10m)

>  Disconnect mergebot from apache/beam-site repository
> -
>
> Key: BEAM-4504
> URL: https://issues.apache.org/jira/browse/BEAM-4504
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-5663) Add tox suites for various Python 3 versions

2018-10-16 Thread Manu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652641#comment-16652641
 ] 

Manu Zhang commented on BEAM-5663:
--

[~tvalentyn], I simply ran "./gradlew testPython3" for each environment. The 
Python 3.4 test takes much, much longer. Is there a flag that doesn't work in 3.4?

> Add tox suites for various Python 3 versions
> 
>
> Key: BEAM-5663
> URL: https://issues.apache.org/jira/browse/BEAM-5663
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Manu Zhang
>Priority: Minor
>
> Currently, Python 3.5.2 is set up for Jenkins tests but we've seen test 
> failings across various Python 3 versions. It will be valuable to add tox 
> suites for Python 3.4, 3.5, 3.6 and 3.7





[jira] [Work logged] (BEAM-5240) Create post-commit tests dashboard

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5240?focusedWorklogId=155209&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155209
 ]

ASF GitHub Bot logged work on BEAM-5240:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:19
Start Date: 16/Oct/18 23:19
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #6711: [BEAM-5240] Add Jira 
data to Beam post-commits dashboard
URL: https://github.com/apache/beam/pull/6711#issuecomment-430434590
 
 
   run python precommit




Issue Time Tracking
---

Worklog Id: (was: 155209)
Time Spent: 5h 10m  (was: 5h)

> Create post-commit tests dashboard
> --
>
> Key: BEAM-5240
> URL: https://issues.apache.org/jira/browse/BEAM-5240
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5240) Create post-commit tests dashboard

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5240?focusedWorklogId=155208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155208
 ]

ASF GitHub Bot logged work on BEAM-5240:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:19
Start Date: 16/Oct/18 23:19
Worklog Time Spent: 10m 
  Work Description: Ardagan removed a comment on issue #6711: [BEAM-5240] 
Add Jira data to Beam post-commits dashboard
URL: https://github.com/apache/beam/pull/6711#issuecomment-430429358
 
 
   run go precommits




Issue Time Tracking
---

Worklog Id: (was: 155208)
Time Spent: 5h  (was: 4h 50m)

> Create post-commit tests dashboard
> --
>
> Key: BEAM-5240
> URL: https://issues.apache.org/jira/browse/BEAM-5240
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5240) Create post-commit tests dashboard

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5240?focusedWorklogId=155211&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155211
 ]

ASF GitHub Bot logged work on BEAM-5240:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:19
Start Date: 16/Oct/18 23:19
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #6711: [BEAM-5240] Add Jira 
data to Beam post-commits dashboard
URL: https://github.com/apache/beam/pull/6711#issuecomment-430434708
 
 
   Running precommits to execute :rat target. Need to implement BEAM-5499 to 
avoid it.




Issue Time Tracking
---

Worklog Id: (was: 155211)
Time Spent: 5h 20m  (was: 5h 10m)

> Create post-commit tests dashboard
> --
>
> Key: BEAM-5240
> URL: https://issues.apache.org/jira/browse/BEAM-5240
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5240) Create post-commit tests dashboard

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5240?focusedWorklogId=155206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155206
 ]

ASF GitHub Bot logged work on BEAM-5240:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:19
Start Date: 16/Oct/18 23:19
Worklog Time Spent: 10m 
  Work Description: Ardagan removed a comment on issue #6711: [BEAM-5240] 
Add Jira data to Beam post-commits dashboard
URL: https://github.com/apache/beam/pull/6711#issuecomment-430429424
 
 
   run python precommit




Issue Time Tracking
---

Worklog Id: (was: 155206)
Time Spent: 4h 40m  (was: 4.5h)

> Create post-commit tests dashboard
> --
>
> Key: BEAM-5240
> URL: https://issues.apache.org/jira/browse/BEAM-5240
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5240) Create post-commit tests dashboard

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5240?focusedWorklogId=155207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155207
 ]

ASF GitHub Bot logged work on BEAM-5240:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:19
Start Date: 16/Oct/18 23:19
Worklog Time Spent: 10m 
  Work Description: Ardagan removed a comment on issue #6711: [BEAM-5240] 
Add Jira data to Beam post-commits dashboard
URL: https://github.com/apache/beam/pull/6711#issuecomment-430429390
 
 
   run go precommit




Issue Time Tracking
---

Worklog Id: (was: 155207)
Time Spent: 4h 50m  (was: 4h 40m)

> Create post-commit tests dashboard
> --
>
> Key: BEAM-5240
> URL: https://issues.apache.org/jira/browse/BEAM-5240
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-5609) Improve Grafana dashboard: Add local testing infrastructure

2018-10-16 Thread Mikhail Gryzykhin (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652630#comment-16652630
 ] 

Mikhail Gryzykhin commented on BEAM-5609:
-

I believe this is covered by .test-infra/metrics/docker-compose.yml by now. 
https://github.com/apache/beam/blob/master/.test-infra/metrics/docker-compose.yml

It spins up the whole service, including data fetching.

I will resolve the ticket.

> Improve Grafana dashboard: Add local testing infrastructure
> ---
>
> Key: BEAM-5609
> URL: https://issues.apache.org/jira/browse/BEAM-5609
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Pablo Estrada
>Assignee: Mikhail Gryzykhin
>Priority: Major
> Fix For: Not applicable
>
>






[jira] [Resolved] (BEAM-5609) Improve Grafana dashboard: Add local testing infrastructure

2018-10-16 Thread Mikhail Gryzykhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin resolved BEAM-5609.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> Improve Grafana dashboard: Add local testing infrastructure
> ---
>
> Key: BEAM-5609
> URL: https://issues.apache.org/jira/browse/BEAM-5609
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Pablo Estrada
>Assignee: Mikhail Gryzykhin
>Priority: Major
> Fix For: Not applicable
>
>






[jira] [Commented] (BEAM-5663) Add tox suites for various Python 3 versions

2018-10-16 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652629#comment-16652629
 ] 

Valentyn Tymofieiev commented on BEAM-5663:
---

[~mauzhang] It seems that your runs also included some tests that we 
currently skip in Python 3. For example, I think the 3.4 logs include a test 
that is supposed to be skipped: 
apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerTestWithGrpc.test_pardo_metrics
 

> Add tox suites for various Python 3 versions
> 
>
> Key: BEAM-5663
> URL: https://issues.apache.org/jira/browse/BEAM-5663
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Manu Zhang
>Priority: Minor
>
> Currently, Python 3.5.2 is set up for Jenkins tests, but we've seen test 
> failures across various Python 3 versions. It would be valuable to add tox 
> suites for Python 3.4, 3.5, 3.6 and 3.7.





[jira] [Commented] (BEAM-5774) beam_Release_Gradle_NightlySnapshot timed out

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652623#comment-16652623
 ] 

Kenneth Knowles commented on BEAM-5774:
---

No, it appears to be a plain-and-simple timeout.

> beam_Release_Gradle_NightlySnapshot timed out
> -
>
> Key: BEAM-5774
> URL: https://issues.apache.org/jira/browse/BEAM-5774
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>
> https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/209/
> Looking at the trend, this is not surprising: 
> https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/buildTimeTrend





[jira] [Commented] (BEAM-5774) beam_Release_Gradle_NightlySnapshot timed out

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652622#comment-16652622
 ] 

Kenneth Knowles commented on BEAM-5774:
---

TBD whether this is BEAM-5249

> beam_Release_Gradle_NightlySnapshot timed out
> -
>
> Key: BEAM-5774
> URL: https://issues.apache.org/jira/browse/BEAM-5774
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
>
> https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/209/
> Looking at the trend, this is not surprising: 
> https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/buildTimeTrend





[jira] [Comment Edited] (BEAM-5663) Add tox suites for various Python 3 versions

2018-10-16 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652621#comment-16652621
 ] 

Valentyn Tymofieiev edited comment on BEAM-5663 at 10/16/18 11:14 PM:
--

Thanks, [~mauzhang]. I looked at the logs, and also verified myself that some 
tests that pass on Python 3.5 on Jenkins fail in other versions of the 
interpreter. FYI [~matthiasml6] [~RobbeSneyders] [~splovyt] [~Juta].

For example:

python ./setup.py test -s apache_beam.typehints.typed_pipeline_test.SideInputTest.test_basic_side_input_hint

fails on Python 3.4 with:

==
ERROR: test_basic_side_input_hint (apache_beam.typehints.typed_pipeline_test.SideInputTest)
--
Traceback (most recent call last):
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 173, in test_basic_side_input_hint
    self._run_repeat_test(repeat)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 144, in _run_repeat_test
    self._run_repeat_test_good(repeat)
  File "/beam/sdks/python/apache_beam/options/pipeline_options.py", line 803, in wrapper
    f(*args, **kwargs)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 150, in _run_repeat_test_good
    result = ['a', 'bb', 'c'] | beam.Map(repeat, 3)
  File "/beam/sdks/python/apache_beam/transforms/ptransform.py", line 496, in __ror__
    p.run().wait_until_finish()
  File "/beam/sdks/python/apache_beam/pipeline.py", line 403, in run
    self.to_runner_api(), self.runner, self._options).run(False)
  File "/beam/sdks/python/apache_beam/pipeline.py", line 416, in run
    return self.runner.run_pipeline(self)
  File "/beam/sdks/python/apache_beam/runners/direct/direct_runner.py", line 139, in run_pipeline
    return runner.run_pipeline(pipeline)
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 231, in run_pipeline
    return self.run_via_runner_api(pipeline.to_runner_api())
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 234, in run_via_runner_api
    return self.run_stages(*self.create_stages(pipeline_proto))
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 967, in create_stages
    pcoll.coder_id = coders.get_id(coder)
  File "/beam/sdks/python/apache_beam/runners/pipeline_context.py", line 79, in get_id
    self._id_to_proto[id] = obj.to_runner_api(self._pipeline_context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 259, in to_runner_api
    component_coder_ids=[context.coders.get_id(c) for c in components])
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 259, in <listcomp>
    component_coder_ids=[context.coders.get_id(c) for c in components])
  File "/beam/sdks/python/apache_beam/runners/pipeline_context.py", line 79, in get_id
    self._id_to_proto[id] = obj.to_runner_api(self._pipeline_context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 250, in to_runner_api
    urn, typed_param, components = self.to_runner_api_parameter(context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 276, in to_runner_api_parameter
    google.protobuf.wrappers_pb2.BytesValue(value=serialize_coder(self)),
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 67, in serialize_coder
    pickler.dumps(coder))
TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'



was (Author: tvalentyn):
Thanks, [~mauzhang]. I looked at the logs, and also verified myself that some 
tests that pass on Python 3.5 on Jenkins fail on Python 3.4. FYI 
[~matthiasml6] [~RobbeSneyders] [~splovyt] [~Juta].

For example:

python ./setup.py test -s apache_beam.typehints.typed_pipeline_test.SideInputTest.test_basic_side_input_hint

fails on Python 3.4 with:

==
ERROR: test_basic_side_input_hint (apache_beam.typehints.typed_pipeline_test.SideInputTest)
--
Traceback (most recent call last):
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 173, in test_basic_side_input_hint
    self._run_repeat_test(repeat)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 144, in _run_repeat_test
    self._run_repeat_test_good(repeat)
  File "/beam/sdks/python/apache_beam/options/pipeline_options.py", line 803, in wrapper
    f(*args, **kwargs)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 150, in _run_repeat_test_good
    result = ['a', 'bb', 'c'] | beam.Map(repeat, 3)
  File "/beam/sdks/python/apache_beam/transforms/ptransform.py", line 496, in __ror__
    p.run().wait_until_finish()
  File "/beam/sdks/python/a

[jira] [Commented] (BEAM-5663) Add tox suites for various Python 3 versions

2018-10-16 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652621#comment-16652621
 ] 

Valentyn Tymofieiev commented on BEAM-5663:
---

Thanks, [~mauzhang]. I verified that some tests that pass on Python 3.5 on 
Jenkins fail on Python 3.4. FYI [~matthiasml6] [~RobbeSneyders] [~splovyt] 
[~Juta].

For example:

python ./setup.py test -s apache_beam.typehints.typed_pipeline_test.SideInputTest.test_basic_side_input_hint

fails on Python 3.4 with:

==
ERROR: test_basic_side_input_hint (apache_beam.typehints.typed_pipeline_test.SideInputTest)
--
Traceback (most recent call last):
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 173, in test_basic_side_input_hint
    self._run_repeat_test(repeat)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 144, in _run_repeat_test
    self._run_repeat_test_good(repeat)
  File "/beam/sdks/python/apache_beam/options/pipeline_options.py", line 803, in wrapper
    f(*args, **kwargs)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 150, in _run_repeat_test_good
    result = ['a', 'bb', 'c'] | beam.Map(repeat, 3)
  File "/beam/sdks/python/apache_beam/transforms/ptransform.py", line 496, in __ror__
    p.run().wait_until_finish()
  File "/beam/sdks/python/apache_beam/pipeline.py", line 403, in run
    self.to_runner_api(), self.runner, self._options).run(False)
  File "/beam/sdks/python/apache_beam/pipeline.py", line 416, in run
    return self.runner.run_pipeline(self)
  File "/beam/sdks/python/apache_beam/runners/direct/direct_runner.py", line 139, in run_pipeline
    return runner.run_pipeline(pipeline)
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 231, in run_pipeline
    return self.run_via_runner_api(pipeline.to_runner_api())
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 234, in run_via_runner_api
    return self.run_stages(*self.create_stages(pipeline_proto))
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 967, in create_stages
    pcoll.coder_id = coders.get_id(coder)
  File "/beam/sdks/python/apache_beam/runners/pipeline_context.py", line 79, in get_id
    self._id_to_proto[id] = obj.to_runner_api(self._pipeline_context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 259, in to_runner_api
    component_coder_ids=[context.coders.get_id(c) for c in components])
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 259, in <listcomp>
    component_coder_ids=[context.coders.get_id(c) for c in components])
  File "/beam/sdks/python/apache_beam/runners/pipeline_context.py", line 79, in get_id
    self._id_to_proto[id] = obj.to_runner_api(self._pipeline_context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 250, in to_runner_api
    urn, typed_param, components = self.to_runner_api_parameter(context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 276, in to_runner_api_parameter
    google.protobuf.wrappers_pb2.BytesValue(value=serialize_coder(self)),
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 67, in serialize_coder
    pickler.dumps(coder))
TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'
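
For context, the final TypeError is consistent with %-formatting applied to a bytes object: Python supports `%` on bytes only from 3.5 onward (PEP 461), so the same expression passes on 3.5 and fails on 3.4. The sketch below is illustrative only. The helper name `join_name_and_payload` and the `$` separator are hypothetical, not a copy of the code in coders.py, and the traceback does not show which exact expression raised. It shows one way to probe for the feature and a way to build such a byte string that works on every Python 3 version.

```python
import sys


def bytes_percent_formatting_supported():
    """Return True if this interpreter supports b'...' % (...)."""
    try:
        b"%s" % (b"x",)
    except TypeError:
        # Python 3.0-3.4: the % operator is not defined for bytes.
        return False
    return True


def join_name_and_payload(name, payload):
    """Hypothetical helper: build b'<name>$<payload>' without using %.

    bytes.join is available on every Python 3 version, so this avoids
    depending on the bytes %-formatting added in Python 3.5.
    """
    return b"$".join([name, payload])


if __name__ == "__main__":
    print(sys.version_info[:2], bytes_percent_formatting_supported())
    print(join_name_and_payload(b"MyCoder", b"payload"))
```

Joining is only one option; plain concatenation with `+` would behave the same way here.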


> Add tox suites for various Python 3 versions
> 
>
> Key: BEAM-5663
> URL: https://issues.apache.org/jira/browse/BEAM-5663
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Manu Zhang
>Priority: Minor
>
> Currently, Python 3.5.2 is set up for Jenkins tests, but we've seen test 
> failures across various Python 3 versions. It would be valuable to add tox 
> suites for Python 3.4, 3.5, 3.6 and 3.7.





[jira] [Comment Edited] (BEAM-5663) Add tox suites for various Python 3 versions

2018-10-16 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652621#comment-16652621
 ] 

Valentyn Tymofieiev edited comment on BEAM-5663 at 10/16/18 11:13 PM:
--

Thanks, [~mauzhang]. I looked at the logs, and also verified myself that some 
tests that pass on Python 3.5 on Jenkins fail on Python 3.4. FYI 
[~matthiasml6] [~RobbeSneyders] [~splovyt] [~Juta].

For example:

python ./setup.py test -s apache_beam.typehints.typed_pipeline_test.SideInputTest.test_basic_side_input_hint

fails on Python 3.4 with:

==
ERROR: test_basic_side_input_hint (apache_beam.typehints.typed_pipeline_test.SideInputTest)
--
Traceback (most recent call last):
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 173, in test_basic_side_input_hint
    self._run_repeat_test(repeat)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 144, in _run_repeat_test
    self._run_repeat_test_good(repeat)
  File "/beam/sdks/python/apache_beam/options/pipeline_options.py", line 803, in wrapper
    f(*args, **kwargs)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 150, in _run_repeat_test_good
    result = ['a', 'bb', 'c'] | beam.Map(repeat, 3)
  File "/beam/sdks/python/apache_beam/transforms/ptransform.py", line 496, in __ror__
    p.run().wait_until_finish()
  File "/beam/sdks/python/apache_beam/pipeline.py", line 403, in run
    self.to_runner_api(), self.runner, self._options).run(False)
  File "/beam/sdks/python/apache_beam/pipeline.py", line 416, in run
    return self.runner.run_pipeline(self)
  File "/beam/sdks/python/apache_beam/runners/direct/direct_runner.py", line 139, in run_pipeline
    return runner.run_pipeline(pipeline)
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 231, in run_pipeline
    return self.run_via_runner_api(pipeline.to_runner_api())
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 234, in run_via_runner_api
    return self.run_stages(*self.create_stages(pipeline_proto))
  File "/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 967, in create_stages
    pcoll.coder_id = coders.get_id(coder)
  File "/beam/sdks/python/apache_beam/runners/pipeline_context.py", line 79, in get_id
    self._id_to_proto[id] = obj.to_runner_api(self._pipeline_context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 259, in to_runner_api
    component_coder_ids=[context.coders.get_id(c) for c in components])
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 259, in <listcomp>
    component_coder_ids=[context.coders.get_id(c) for c in components])
  File "/beam/sdks/python/apache_beam/runners/pipeline_context.py", line 79, in get_id
    self._id_to_proto[id] = obj.to_runner_api(self._pipeline_context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 250, in to_runner_api
    urn, typed_param, components = self.to_runner_api_parameter(context)
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 276, in to_runner_api_parameter
    google.protobuf.wrappers_pb2.BytesValue(value=serialize_coder(self)),
  File "/beam/sdks/python/apache_beam/coders/coders.py", line 67, in serialize_coder
    pickler.dumps(coder))
TypeError: unsupported operand type(s) for %: 'bytes' and 'tuple'



was (Author: tvalentyn):
Thanks, [~mauzhang]. I verified that some tests that pass on Python 3.5 on 
Jenkins fail on Python 3.4. FYI [~matthiasml6] [~RobbeSneyders] [~splovyt] 
[~Juta].

For example:

python ./setup.py test -s apache_beam.typehints.typed_pipeline_test.SideInputTest.test_basic_side_input_hint

fails on Python 3.4 with:

==
ERROR: test_basic_side_input_hint (apache_beam.typehints.typed_pipeline_test.SideInputTest)
--
Traceback (most recent call last):
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 173, in test_basic_side_input_hint
    self._run_repeat_test(repeat)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 144, in _run_repeat_test
    self._run_repeat_test_good(repeat)
  File "/beam/sdks/python/apache_beam/options/pipeline_options.py", line 803, in wrapper
    f(*args, **kwargs)
  File "/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 150, in _run_repeat_test_good
    result = ['a', 'bb', 'c'] | beam.Map(repeat, 3)
  File "/beam/sdks/python/apache_beam/transforms/ptransform.py", line 496, in __ror__
    p.run().wait_until_finish()
  File "/beam/sdks/python/apache_beam/pipeline.py", line 403, in run
    self.to_runne

[jira] [Commented] (BEAM-5057) beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652620#comment-16652620
 ] 

Kenneth Knowles commented on BEAM-5057:
---

Is this now obsolete?

> beam_Release_Gradle_NightlySnapshot failing due to a Javadoc error
> --
>
> Key: BEAM-5057
> URL: https://issues.apache.org/jira/browse/BEAM-5057
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/127/console]
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/125/console]
>  
> * What went wrong:
> Execution failed for task ':beam-sdks-java-core:javadoc'.
> > Javadoc generation failed. Generated Javadoc options file (useful for 
> > troubleshooting): 
> > '/home/jenkins/jenkins-slave/workspace/beam_Release_Gradle_NightlySnapshot/src/sdks/java/core/build/tmp/javadoc/javadoc.options'
>  





[jira] [Created] (BEAM-5774) beam_Release_Gradle_NightlySnapshot timed out

2018-10-16 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5774:
-

 Summary: beam_Release_Gradle_NightlySnapshot timed out
 Key: BEAM-5774
 URL: https://issues.apache.org/jira/browse/BEAM-5774
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/209/

Looking at the trend, this is not surprising: 
https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/buildTimeTrend





[jira] [Commented] (BEAM-5773) Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for the Java Runtime Environment to continue."

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652617#comment-16652617
 ] 

Kenneth Knowles commented on BEAM-5773:
---

Looks like the same thing as happened on 
[https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1401/], only in 
this case Gradle could start threads but the Python test framework could not.

{code}
OpenBLAS blas_thread_init: RLIMIT_NPROC 10240 current, 10240 max
Process SyncManager-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/multiprocessing/managers.py", line 558, in _run_server
    server.serve_forever()
  File "/usr/lib/python2.7/multiprocessing/managers.py", line 184, in serve_forever
    t.start()
  File "/usr/lib/python2.7/threading.py", line 736, in start
    _start_new_thread(self.__bootstrap, ())
error: can't start new thread
interrupted
./scripts/run_postcommit.sh: line 124: 32380 Segmentation fault  (core dumped) python setup.py nosetests --attr $1 --nologcapture --processes=8 --process-timeout=3000 --test-pipeline-options="$JOINED_OPTS" $TESTS
{code}
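
The `RLIMIT_NPROC 10240 current, 10240 max` line together with `can't start new thread` suggests the worker may have been near its per-user process/thread limit, though the log alone does not prove it. A minimal sketch for inspecting that limit from Python on a Unix host follows. It uses only the standard `resource` and `threading` modules; the function name `describe_thread_headroom` is made up for illustration and nothing here comes from the Beam scripts.

```python
import resource
import threading


def describe_thread_headroom():
    """Report the RLIMIT_NPROC soft/hard limits and this process's thread count.

    On Linux, RLIMIT_NPROC counts threads as well as processes for the
    real user ID, so a busy shared build machine can exhaust it.
    """
    limit = getattr(resource, "RLIMIT_NPROC", None)
    if limit is None:
        # Not every platform exposes RLIMIT_NPROC.
        soft, hard = None, None
    else:
        soft, hard = resource.getrlimit(limit)
    return {
        "soft_limit": soft,
        "hard_limit": hard,
        "threads_in_this_process": threading.active_count(),
    }


if __name__ == "__main__":
    print(describe_thread_headroom())
```

Running this on the affected Jenkins worker while other jobs are active would only show the configured limits, not how many threads the user currently owns in total; that figure would have to come from the operating system.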

> Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for 
> the Java Runtime Environment to continue."
> --
>
> Key: BEAM-5773
> URL: https://issues.apache.org/jira/browse/BEAM-5773
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>
> Jenkins failed on the Python Dataflow ValidatesRunner postcommit because 
> Gradle could not allocate a thread.
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1402/console]
> Likely transient, but filing this to track if that is the case.
>  {code}
> 15:07:52 [src] $ 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/gradlew
>  --info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
> -Dorg.gradle.jvmargs=-Xmx4g :beam-sdks-python:validatesRunnerBatchTests 
> :beam-sdks-python:validatesRunnerStreamingTests
> 15:07:52 #
> 15:07:52 # There is insufficient memory for the Java Runtime Environment to 
> continue.
> 15:07:52 # Cannot create GC thread. Out of system resources.
> 15:07:52 # An error report file with more information is saved as:
> 15:07:52 # 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/hs_err_pid31336.log
> 15:07:53 Build step 'Invoke Gradle script' changed build result to FAILURE
> 15:07:53 Build step 'Invoke Gradle script' marked build as failure
> 15:07:56 Sending e-mails to: comm...@beam.apache.org
> 15:07:57 No emails were triggered.
> 15:07:57 Finished: FAILURE
> {code}





[jira] [Assigned] (BEAM-5773) Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for the Java Runtime Environment to continue."

2018-10-16 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5773:
-

Assignee: Kenneth Knowles

> Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for 
> the Java Runtime Environment to continue."
> --
>
> Key: BEAM-5773
> URL: https://issues.apache.org/jira/browse/BEAM-5773
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>
> Jenkins failed on the Python Dataflow ValidatesRunner postcommit because 
> Gradle could not allocate a thread.
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1402/console]
> Likely transient, but filing this to track if that is the case.
>  {code}
> 15:07:52 [src] $ 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/gradlew
>  --info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
> -Dorg.gradle.jvmargs=-Xmx4g :beam-sdks-python:validatesRunnerBatchTests 
> :beam-sdks-python:validatesRunnerStreamingTests
> 15:07:52 #
> 15:07:52 # There is insufficient memory for the Java Runtime Environment to 
> continue.
> 15:07:52 # Cannot create GC thread. Out of system resources.
> 15:07:52 # An error report file with more information is saved as:
> 15:07:52 # 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/hs_err_pid31336.log
> 15:07:53 Build step 'Invoke Gradle script' changed build result to FAILURE
> 15:07:53 Build step 'Invoke Gradle script' marked build as failure
> 15:07:56 Sending e-mails to: comm...@beam.apache.org
> 15:07:57 No emails were triggered.
> 15:07:57 Finished: FAILURE
> {code}





[jira] [Commented] (BEAM-5773) Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for the Java Runtime Environment to continue."

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652615#comment-16652615
 ] 

Kenneth Knowles commented on BEAM-5773:
---

Removed auto-assignee.

> Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for 
> the Java Runtime Environment to continue."
> --
>
> Key: BEAM-5773
> URL: https://issues.apache.org/jira/browse/BEAM-5773
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Priority: Major
>
> Jenkins failed on the Python Dataflow ValidatesRunner postcommit because 
> Gradle could not allocate a thread.
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1402/console]
> Likely transient, but filing this to track if that is the case.
>  {code}
> 15:07:52 [src] $ 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/gradlew
>  --info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
> -Dorg.gradle.jvmargs=-Xmx4g :beam-sdks-python:validatesRunnerBatchTests 
> :beam-sdks-python:validatesRunnerStreamingTests
> 15:07:52 #
> 15:07:52 # There is insufficient memory for the Java Runtime Environment to 
> continue.
> 15:07:52 # Cannot create GC thread. Out of system resources.
> 15:07:52 # An error report file with more information is saved as:
> 15:07:52 # 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/hs_err_pid31336.log
> 15:07:53 Build step 'Invoke Gradle script' changed build result to FAILURE
> 15:07:53 Build step 'Invoke Gradle script' marked build as failure
> 15:07:56 Sending e-mails to: comm...@beam.apache.org
> 15:07:57 No emails were triggered.
> 15:07:57 Finished: FAILURE
> {code}





[jira] [Created] (BEAM-5773) Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for the Java Runtime Environment to continue."

2018-10-16 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5773:
-

 Summary: Failure in beam_PostCommit_Py_VR_Dataflow "There is 
insufficient memory for the Java Runtime Environment to continue."
 Key: BEAM-5773
 URL: https://issues.apache.org/jira/browse/BEAM-5773
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Kenneth Knowles
Assignee: Luke Cwik


Jenkins failed on the Python Dataflow ValidatesRunner postcommit because 
Gradle could not allocate a thread.

[https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1402/console]

Likely transient, but filing this to track if that is the case.

 {code}
15:07:52 [src] $ /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/gradlew --info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g -Dorg.gradle.jvmargs=-Xmx4g :beam-sdks-python:validatesRunnerBatchTests :beam-sdks-python:validatesRunnerStreamingTests
15:07:52 #
15:07:52 # There is insufficient memory for the Java Runtime Environment to continue.
15:07:52 # Cannot create GC thread. Out of system resources.
15:07:52 # An error report file with more information is saved as:
15:07:52 # /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/hs_err_pid31336.log
15:07:53 Build step 'Invoke Gradle script' changed build result to FAILURE
15:07:53 Build step 'Invoke Gradle script' marked build as failure
15:07:56 Sending e-mails to: comm...@beam.apache.org
15:07:57 No emails were triggered.
15:07:57 Finished: FAILURE
{code}





[jira] [Work logged] (BEAM-5730) Migrate Java test to use a staged worker jar

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5730?focusedWorklogId=155203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155203
 ]

ASF GitHub Bot logged work on BEAM-5730:


Author: ASF GitHub Bot
Created on: 16/Oct/18 23:07
Start Date: 16/Oct/18 23:07
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #6694: [BEAM-5730] Migrate 
ITs using DataflowRunner to use custom worker
URL: https://github.com/apache/beam/pull/6694#issuecomment-430432035
 
 
   Run Java PostCommit




Issue Time Tracking
---

Worklog Id: (was: 155203)
Time Spent: 0.5h  (was: 20m)

> Migrate Java test to use a staged worker jar
> 
>
> Key: BEAM-5730
> URL: https://issues.apache.org/jira/browse/BEAM-5730
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Assigned] (BEAM-5773) Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for the Java Runtime Environment to continue."

2018-10-16 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5773:
-

Assignee: (was: Luke Cwik)

> Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for 
> the Java Runtime Environment to continue."
> --
>
> Key: BEAM-5773
> URL: https://issues.apache.org/jira/browse/BEAM-5773
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Priority: Major
>
> Jenkins failed on the Python Dataflow ValidatesRunner postcommit because 
> Gradle could not allocate a thread.
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1402/console]
> Likely transient, but filing this to track if that is the case.
>  {code}
> 15:07:52 [src] $ 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/gradlew
>  --info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
> -Dorg.gradle.jvmargs=-Xmx4g :beam-sdks-python:validatesRunnerBatchTests 
> :beam-sdks-python:validatesRunnerStreamingTests
> 15:07:52 #
> 15:07:52 # There is insufficient memory for the Java Runtime Environment to 
> continue.
> 15:07:52 # Cannot create GC thread. Out of system resources.
> 15:07:52 # An error report file with more information is saved as:
> 15:07:52 # 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/hs_err_pid31336.log
> 15:07:53 Build step 'Invoke Gradle script' changed build result to FAILURE
> 15:07:53 Build step 'Invoke Gradle script' marked build as failure
> 15:07:56 Sending e-mails to: comm...@beam.apache.org
> 15:07:57 No emails were triggered.
> 15:07:57 Finished: FAILURE
> {code}





[jira] [Work logged] (BEAM-5772) GCP IO tests slow down general Beam PostCommits

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5772?focusedWorklogId=155197&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155197
 ]

ASF GitHub Bot logged work on BEAM-5772:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:58
Start Date: 16/Oct/18 22:58
Worklog Time Spent: 10m 
  Work Description: pabloem opened a new pull request #6712: [BEAM-5772] 
Moving GCP IO tests to a new post commit suite
URL: https://github.com/apache/beam/pull/6712
 
 
   r: @Ardagan 
   
   Can you help me with these jenkins jobs? : )




Issue Time Tracking
---

Worklog Id: (was: 155197)
Time Spent: 10m
Remaining Estimate: 0h

> GCP IO tests slow down general Beam PostCommits
> ---
>
> Key: BEAM-5772
> URL: https://issues.apache.org/jira/browse/BEAM-5772
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, testing
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-5772) GCP IO tests slow down general Beam PostCommits

2018-10-16 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-5772:
---

 Summary: GCP IO tests slow down general Beam PostCommits
 Key: BEAM-5772
 URL: https://issues.apache.org/jira/browse/BEAM-5772
 Project: Beam
  Issue Type: Bug
  Components: io-java-gcp, testing
Reporter: Pablo Estrada
Assignee: Pablo Estrada








[jira] [Resolved] (BEAM-5685) TopWikipediaSessionsIT is flaky

2018-10-16 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-5685.
-
   Resolution: Fixed
Fix Version/s: 2.8.0

> TopWikipediaSessionsIT is flaky
> ---
>
> Key: BEAM-5685
> URL: https://issues.apache.org/jira/browse/BEAM-5685
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-5693) Python SDK tests failing on Windows

2018-10-16 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-5693.
-
   Resolution: Fixed
Fix Version/s: 2.8.0

> Python SDK tests failing on Windows
> ---
>
> Key: BEAM-5693
> URL: https://issues.apache.org/jira/browse/BEAM-5693
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-5684) Need a test that verifies Flattening / not-flattening of BQ nested records

2018-10-16 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-5684.
-
   Resolution: Fixed
Fix Version/s: 2.8.0

> Need a test that verifies Flattening / not-flattening of BQ nested records
> --
>
> Key: BEAM-5684
> URL: https://issues.apache.org/jira/browse/BEAM-5684
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-5407) [beam_PostCommit_Go_GradleBuild][testE2ETopWikiPages][RolledBack] Breaks post commit

2018-10-16 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-5407.
-
   Resolution: Fixed
Fix Version/s: 2.8.0

> [beam_PostCommit_Go_GradleBuild][testE2ETopWikiPages][RolledBack] Breaks post 
> commit
> 
>
> Key: BEAM-5407
> URL: https://issues.apache.org/jira/browse/BEAM-5407
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Batkhuyag Batsaikhan
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.8.0
>
>
> Failing job url: 
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1482/testReport/





[jira] [Work logged] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4130?focusedWorklogId=155188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155188
 ]

ASF GitHub Bot logged work on BEAM-4130:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:48
Start Date: 16/Oct/18 22:48
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6703: [BEAM-4130] Add tests 
for FlinkJobServerDriver
URL: https://github.com/apache/beam/pull/6703#issuecomment-430428033
 
 
   Run Java PreCommit




Issue Time Tracking
---

Worklog Id: (was: 155188)
Time Spent: 14h 10m  (was: 14h)

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 2.7.0
>
>  Time Spent: 14h 10m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=155183&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155183
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:44
Start Date: 16/Oct/18 22:44
Worklog Time Spent: 10m 
  Work Description: aaltay commented on a change in pull request #6680: 
[BEAM-5637] Python support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#discussion_r225731222
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -520,6 +520,12 @@ def _add_argparse_args(cls, parser):
 type=str,
 help='GCE minimum CPU platform. Default is determined by GCP.'
 )
+parser.add_argument(
 
 Review comment:
   In light of the discussion on the dev@ list about runner options 
(https://lists.apache.org/thread.html/78fe33dc41b04886f5355d66d50359265bfa2985580bb70f79c53545@%3Cdev.beam.apache.org%3E),
 would it be better to expose this as a runner option?
   
   @robertwb 




Issue Time Tracking
---

Worklog Id: (was: 155183)
Time Spent: 4h 50m  (was: 4h 40m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=155184&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155184
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:44
Start Date: 16/Oct/18 22:44
Worklog Time Spent: 10m 
  Work Description: aaltay commented on a change in pull request #6680: 
[BEAM-5637] Python support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#discussion_r225732036
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -520,6 +520,12 @@ def _add_argparse_args(cls, parser):
 type=str,
 help='GCE minimum CPU platform. Default is determined by GCP.'
 )
+parser.add_argument(
+'--dataflow_worker_jar',
+dest='dataflow_worker_jar',
+type=str,
+help='Dataflow worker jar.'
 
 Review comment:
   Could you update the description here?
   
   We would not expect users to use this option typically. The biggest use case is 
probably development-related changes. It also cannot be used for legacy 
pipelines. (Should this be an error if the fn api experiment is not set but 
this flag is used?)
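
A minimal sketch of the two suggestions in this comment, for illustration only. The PR itself only adds the flag with the help text 'Dataflow worker jar.'. The helper names `add_worker_jar_option` and `validate_worker_jar` are hypothetical, the longer help text is a suggestion, and `beam_fn_api` is assumed to be the name of the fn api experiment.

```python
import argparse


def add_worker_jar_option(parser):
    """Registers --dataflow_worker_jar with a more descriptive help text."""
    parser.add_argument(
        '--dataflow_worker_jar',
        dest='dataflow_worker_jar',
        type=str,
        default=None,
        help='Path to a custom Dataflow worker jar. Intended mainly for '
             'development; cannot be used with legacy pipelines.')


def validate_worker_jar(worker_jar, experiments,
                        fn_api_experiment='beam_fn_api'):
    """Returns a list of error strings; empty if the options are consistent."""
    errors = []
    # Assumption: the flag is only meaningful when the fn api experiment is on.
    if worker_jar is not None and fn_api_experiment not in (experiments or []):
        errors.append(
            '--dataflow_worker_jar requires the %s experiment.'
            % fn_api_experiment)
    return errors
```

Returning a list of errors rather than raising mirrors the `errors = []` pattern visible in the `validate` method touched by the PR, but whether the check belongs there is left open.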




Issue Time Tracking
---

Worklog Id: (was: 155184)
Time Spent: 5h  (was: 4h 50m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Commented] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-16 Thread Melissa Pashniak (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652569#comment-16652569
 ] 

Melissa Pashniak commented on BEAM-5741:


What do you mean by top-level link here? Do you mean another item among the top items 
(documentation, SDKs, community, etc.)?

 

> Move "Contact Us" to a top-level link
> -
>
> Key: BEAM-5741
> URL: https://issues.apache.org/jira/browse/BEAM-5741
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> It should be very easy to figure out how to get in touch with the community. 
> "Contact Us" should be a top-level link on the page.
> The page can also be improved with:
> * Some basic text on how to use subscribe / unsubscribe links
> * Recommendations on how to use various communications channels (Slack for 
> quick questions, dev@ for longer conversations. And all decisions should make 
> it back to dev@)





[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155171&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155171
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:39
Start Date: 16/Oct/18 22:39
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637#issuecomment-430425870
 
 
   Very cool. Thanks Micah!




Issue Time Tracking
---

Worklog Id: (was: 155171)
Time Spent: 5h 10m  (was: 5h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].





[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155172
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:39
Start Date: 16/Oct/18 22:39
Worklog Time Spent: 10m 
  Work Description: pabloem closed pull request #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
index 42b9c1114a7..2b276f404c7 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
@@ -17,6 +17,9 @@
  */
 package org.apache.beam.runners.flink;
 
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.auto.service.AutoService;
 import com.google.common.collect.BiMap;
 import com.google.common.collect.HashMultiset;
 import com.google.common.collect.ImmutableMap;
@@ -34,6 +37,7 @@
 import java.util.TreeMap;
 import org.apache.beam.model.pipeline.v1.RunnerApi;
 import org.apache.beam.runners.core.SystemReduceFn;
+import org.apache.beam.runners.core.construction.NativeTransforms;
 import org.apache.beam.runners.core.construction.PTransformTranslation;
 import org.apache.beam.runners.core.construction.RehydratedComponents;
 import org.apache.beam.runners.core.construction.RunnerPCollectionView;
@@ -52,6 +56,7 @@
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.SingletonKeyedWorkItemCoder;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WindowDoFnOperator;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WorkItemKeySelector;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.StreamingImpulseSource;
 import org.apache.beam.runners.fnexecution.provisioning.JobInfo;
 import org.apache.beam.runners.fnexecution.wire.WireCoders;
 import org.apache.beam.sdk.coders.ByteArrayCoder;
@@ -156,6 +161,9 @@ public StreamExecutionEnvironment getExecutionEnvironment() 
{
 void translate(String id, RunnerApi.Pipeline pipeline, T t);
   }
 
+  private static final String STREAMING_IMPULSE_TRANSFORM_URN =
+  "flink:transform:streaming_impulse:v1";
+
   private final Map>
   urnToTransformTranslator;
 
@@ -165,6 +173,7 @@ public StreamExecutionEnvironment getExecutionEnvironment() 
{
 translatorMap.put(PTransformTranslation.FLATTEN_TRANSFORM_URN, 
this::translateFlatten);
 translatorMap.put(PTransformTranslation.GROUP_BY_KEY_TRANSFORM_URN, 
this::translateGroupByKey);
 translatorMap.put(PTransformTranslation.IMPULSE_TRANSFORM_URN, 
this::translateImpulse);
+translatorMap.put(STREAMING_IMPULSE_TRANSFORM_URN, 
this::translateStreamingImpulse);
 translatorMap.put(
 PTransformTranslation.ASSIGN_WINDOWS_TRANSFORM_URN, 
this::translateAssignWindows);
 translatorMap.put(ExecutableStage.URN, this::translateExecutableStage);
@@ -403,6 +412,40 @@ private void translateImpulse(
 
context.addDataStream(Iterables.getOnlyElement(pTransform.getOutputsMap().values()),
 source);
   }
 
+  /** Predicate to determine whether a URN is a Flink native transform. */
+  @AutoService(NativeTransforms.IsNativeTransform.class)
+  public static class IsFlinkNativeTransform implements 
NativeTransforms.IsNativeTransform {
+@Override
+public boolean test(RunnerApi.PTransform pTransform) {
+  return STREAMING_IMPULSE_TRANSFORM_URN.equals(
+  PTransformTranslation.urnForTransformOrNull(pTransform));
+}
+  }
+
+  private void translateStreamingImpulse(
+  String id, RunnerApi.Pipeline pipeline, StreamingTranslationContext 
context) {
+RunnerApi.PTransform pTransform = 
pipeline.getComponents().getTransformsOrThrow(id);
+
+ObjectMapper objectMapper = new ObjectMapper();
+
+int intervalMillis;
+int messageCount;
+try {
+  JsonNode config = 
objectMapper.readTree(pTransform.getSpec().getPayload().toByteArray());
+  intervalMillis = config.path("interval_ms").asInt(100);
+  messageCount = config.path("message_count").asInt(0);
+} catch (IOException e) {
+  throw new RuntimeException("Failed to parse configuration for streaming 
impulse", e);
+}
+
+Dat

[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=155170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155170
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:38
Start Date: 16/Oct/18 22:38
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #6680: [BEAM-5637] Python 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#issuecomment-430425707
 
 
   How does this interact with installing the packages in boot.go? Would this 
(https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L104) 
not fail?




Issue Time Tracking
---

Worklog Id: (was: 155170)
Time Spent: 4h 40m  (was: 4.5h)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Work logged] (BEAM-5627) Several IO tests fail in Python 3 when accessing a temporary file with TypeError: a bytes-like object is required, not 'str'

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5627?focusedWorklogId=155169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155169
 ]

ASF GitHub Bot logged work on BEAM-5627:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:37
Start Date: 16/Oct/18 22:37
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #6671: [BEAM-5627] Fix 
sources test for py3.
URL: https://github.com/apache/beam/pull/6671#issuecomment-430425580
 
 
   Thanks. Please hold this PR; it is not ready yet and fails beamimport internal 
testing.
   
   




Issue Time Tracking
---

Worklog Id: (was: 155169)
Time Spent: 2h 40m  (was: 2.5h)

> Several IO tests fail in Python 3  when accessing a temporary file with  
> TypeError: a bytes-like object is required, not 'str'
> --
>
> Key: BEAM-5627
> URL: https://issues.apache.org/jira/browse/BEAM-5627
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Rakesh Kumar
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ERROR: test_split_at_fraction_exhaustive 
> (apache_beam.io.source_test_utils_test.SourceTestUtilsTest)
>  --
>  Traceback (most recent call last):
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py",
>  line 120, in test_split_at_fraction_exhaustive
>  source = self._create_source(data)
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py",
>  line 43, in _create_source
>  source = LineSource(self._create_file_with_data(data))
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py",
>  line 35, in _create_file_with_data
>  f.write(line + '\n')
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/tempfile.py",
>  line 622, in func_wrapper
>  return func(*args, **kwargs)
> TypeError: a bytes-like object is required, not 'str'
> Also similar:
> ==
>  ERROR: test_file_sink_writing 
> (apache_beam.io.filebasedsink_test.TestFileBasedSink)
> --
> Traceback (most recent call last):
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink_test.py", line 121, in 
> test_file_sink_writing
>   init_token, writer_results = self._common_init(sink)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink_test.py", line 103, in _common_init
>   writer1 = sink.open_writer(init_token, '1')
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/options/value_provider.py", line 133, in _f
>   return fnc(self, *args, **kwargs)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink.py", line 185, in open_writer
> return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink.py", line 385, in __init__
>   self.temp_handle = self.sink.open(temp_shard_path)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink_test.py", line 82, in open
>   file_handle.write('[start]')
>   TypeError: a bytes-like object is required, not 'str'
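
Both tracebacks above end in a `str` being written to a file handle opened in binary mode, which Python 3 rejects. A minimal sketch of one possible fix for a helper like `_create_file_with_data`, assuming the test file comes from `tempfile.NamedTemporaryFile` (which opens in binary mode by default). `create_file_with_data` is a hypothetical stand-in, not the code from PR #6671.

```python
import tempfile


def create_file_with_data(lines):
    """Writes each line plus a newline to a temp file and returns its path."""
    # NamedTemporaryFile opens in binary mode ('w+b') by default, so str
    # input has to be encoded before it is written under Python 3.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for line in lines:
            if isinstance(line, str):
                line = line.encode('utf-8')
            f.write(line + b'\n')
        return f.name
```

An alternative would be to open the temporary file in text mode (`mode='w'`) and keep writing `str`; which of the two suits the Beam tests depends on whether the source under test reads bytes or text.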





[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155168
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:37
Start Date: 16/Oct/18 22:37
Worklog Time Spent: 10m 
  Work Description: mwylde commented on issue #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637#issuecomment-430425385
 
 
   Style checks are passing, should be good to merge. Thanks for the reviews!




Issue Time Tracking
---

Worklog Id: (was: 155168)
Time Spent: 5h  (was: 4h 50m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].
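
The periodic impulse described above can be sketched in plain Python as follows. The `interval_ms` and `message_count` keys and their defaults (100 and 0) are taken from the translator code in PR #6637; treating a count of 0 as "never stop" is an assumption, and `parse_impulse_config` and `impulses` are hypothetical names that are not part of Beam.

```python
import json
import time


def parse_impulse_config(payload):
    """Reads (interval_ms, message_count) from a JSON payload given as bytes."""
    config = json.loads(payload.decode('utf-8'))
    interval_ms = int(config.get('interval_ms', 100))
    message_count = int(config.get('message_count', 0))
    return interval_ms, message_count


def impulses(interval_ms, message_count, sleep=time.sleep):
    """Yields an empty byte string, pausing interval_ms between elements."""
    emitted = 0
    # Assumption: message_count == 0 means the source is unbounded.
    while message_count == 0 or emitted < message_count:
        yield b''
        emitted += 1
        sleep(interval_ms / 1000.0)
```

The `sleep` parameter is injectable only so the generator can be exercised without waiting, for example `list(impulses(100, 3, sleep=lambda s: None))`.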





[jira] [Work logged] (BEAM-5627) Several IO tests fail in Python 3 when accessing a temporary file with TypeError: a bytes-like object is required, not 'str'

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5627?focusedWorklogId=155165&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155165
 ]

ASF GitHub Bot logged work on BEAM-5627:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:34
Start Date: 16/Oct/18 22:34
Worklog Time Spent: 10m 
  Work Description: manuzhang commented on issue #6671: [BEAM-5627] Fix 
sources test for py3.
URL: https://github.com/apache/beam/pull/6671#issuecomment-430424774
 
 
   R: @tvalentyn @aaltay 




Issue Time Tracking
---

Worklog Id: (was: 155165)
Time Spent: 2.5h  (was: 2h 20m)

> Several IO tests fail in Python 3  when accessing a temporary file with  
> TypeError: a bytes-like object is required, not 'str'
> --
>
> Key: BEAM-5627
> URL: https://issues.apache.org/jira/browse/BEAM-5627
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Rakesh Kumar
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> ERROR: test_split_at_fraction_exhaustive 
> (apache_beam.io.source_test_utils_test.SourceTestUtilsTest)
>  --
>  Traceback (most recent call last):
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py",
>  line 120, in test_split_at_fraction_exhaustive
>  source = self._create_source(data)
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py",
>  line 43, in _create_source
>  source = LineSource(self._create_file_with_data(data))
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py",
>  line 35, in _create_file_with_data
>  f.write(line + '\n')
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/tempfile.py",
>  line 622, in func_wrapper
>  return func(*args, **kwargs)
> TypeError: a bytes-like object is required, not 'str'
> Also similar:
> ==
>  ERROR: test_file_sink_writing 
> (apache_beam.io.filebasedsink_test.TestFileBasedSink)
> --
> Traceback (most recent call last):
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink_test.py", line 121, in 
> test_file_sink_writing
>   init_token, writer_results = self._common_init(sink)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink_test.py", line 103, in _common_init
>   writer1 = sink.open_writer(init_token, '1')
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/options/value_provider.py", line 133, in _f
>   return fnc(self, *args, **kwargs)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink.py", line 185, in open_writer
> return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink.py", line 385, in __init__
>   self.temp_handle = self.sink.open(temp_shard_path)
> File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/   
>apache_beam/io/filebasedsink_test.py", line 82, in open
>   file_handle.write('[start]')
>   TypeError: a bytes-like object is required, not 'str'





[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=155152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155152
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:10
Start Date: 16/Oct/18 22:10
Worklog Time Spent: 10m 
  Work Description: mwylde commented on issue #6637: [BEAM-5707] Add a 
periodic, streaming impulse source for Flink portable pipelines
URL: https://github.com/apache/beam/pull/6637#issuecomment-430419035
 
 
   Run Python PreCommit




Issue Time Tracking
---

Worklog Id: (was: 155152)
Time Spent: 4h 50m  (was: 4h 40m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].





[jira] [Commented] (BEAM-5176) FailOnWarnings behave differently between CLI and Intellij build

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652531#comment-16652531
 ] 

Kenneth Knowles commented on BEAM-5176:
---

My mistake; I was on a funky branch.

> FailOnWarnings behave differently between CLI and Intellij build 
> -
>
> Key: BEAM-5176
> URL: https://issues.apache.org/jira/browse/BEAM-5176
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Etienne Chauchot
>Assignee: Kenneth Knowles
>Priority: Major
>
>  On the command line the build passes, but in the IDE it fails because of warnings. 
> To make it pass I had to set failOnWarnings to false in ApplyJavaNature.





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=155151&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155151
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 16/Oct/18 22:06
Start Date: 16/Oct/18 22:06
Worklog Time Spent: 10m 
  Work Description: pabloem closed pull request #6680: [BEAM-5637] Python 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/options/pipeline_options.py b/sdks/python/apache_beam/options/pipeline_options.py
index a0059dbb381..357c97ea6da 100644
--- a/sdks/python/apache_beam/options/pipeline_options.py
+++ b/sdks/python/apache_beam/options/pipeline_options.py
@@ -520,6 +520,12 @@ def _add_argparse_args(cls, parser):
         type=str,
         help='GCE minimum CPU platform. Default is determined by GCP.'
     )
+    parser.add_argument(
+        '--dataflow_worker_jar',
+        dest='dataflow_worker_jar',
+        type=str,
+        help='Dataflow worker jar.'
+    )
 
   def validate(self, validator):
     errors = []
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
index 1acd3488524..4143f2dbb1d 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
@@ -381,6 +381,13 @@ def run_pipeline(self, pipeline):
     self.dataflow_client = apiclient.DataflowApplicationClient(
         pipeline._options)
 
+    dataflow_worker_jar = getattr(worker_options, 'dataflow_worker_jar', None)
+    if dataflow_worker_jar is not None:
+      experiments = ["use_staged_dataflow_worker_jar"]
+      if debug_options.experiments is not None:
+        experiments = list(set(experiments + debug_options.experiments))
+      debug_options.experiments = experiments
+
     # Create the job description and send a request to the service. The result
     # can be None if there is no need to send a request to the service (e.g.
     # template creation). If a request was sent and failed then the call will
diff --git a/sdks/python/apache_beam/runners/portability/stager.py b/sdks/python/apache_beam/runners/portability/stager.py
index ef7401ac6aa..cd7e24fce51 100644
--- a/sdks/python/apache_beam/runners/portability/stager.py
+++ b/sdks/python/apache_beam/runners/portability/stager.py
@@ -59,6 +59,7 @@
 from apache_beam.internal import pickler
 from apache_beam.io.filesystems import FileSystems
 from apache_beam.options.pipeline_options import SetupOptions
+from apache_beam.options.pipeline_options import WorkerOptions
 # TODO(angoenka): Remove reference to dataflow internal names
 from apache_beam.runners.dataflow.internal import names
 from apache_beam.utils import processes
@@ -123,8 +124,7 @@ def stage_job_resources(self,
 
     Returns:
       A list of file names (no paths) for the resources staged. All the
-      files
-      are assumed to be staged at staging_location.
+      files are assumed to be staged at staging_location.
 
     Raises:
       RuntimeError: If files specified are not found or error encountered
@@ -256,6 +256,14 @@ def stage_job_resources(self,
               'The file "%s" cannot be found. Its location was specified by '
               'the --sdk_location command-line option.' % sdk_path)
 
+    worker_options = options.view_as(WorkerOptions)
+    dataflow_worker_jar = getattr(worker_options, 'dataflow_worker_jar', None)
+    if dataflow_worker_jar is not None:
+      jar_staged_filename = 'dataflow-worker.jar'
+      staged_path = FileSystems.join(staging_location, jar_staged_filename)
+      self.stage_artifact(dataflow_worker_jar, staged_path)
+      resources.append(jar_staged_filename)
+
     # Delete all temp files created while staging job resources.
     shutil.rmtree(temp_dir)
     retrieval_token = self.commit_manifest()
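
The experiment handling in the dataflow_runner.py hunk above can be tried out in isolation. What follows is a minimal sketch and not the Beam code: merge_experiments is a hypothetical helper name, and unlike the list(set(...)) call in the diff it keeps the caller's original ordering, which is an assumption about what a reader might prefer rather than something the PR does.

```python
def merge_experiments(existing, required):
    """Return the existing experiments plus any required ones, without duplicates.

    `existing` may be None, mirroring the `is not None` check in the diff.
    Order is preserved; the diff's list(set(...)) gives no ordering guarantee.
    """
    merged = list(existing or [])
    for name in required:
        if name not in merged:
            merged.append(name)
    return merged


# The experiment name is taken from the diff; the other value is made up.
REQUIRED = ["use_staged_dataflow_worker_jar"]
print(merge_experiments(None, REQUIRED))
print(merge_experiments(["some_other_experiment"], REQUIRED))
```

Passing a list that already contains the required experiment leaves it unchanged, so calling the helper more than once is harmless.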


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155151)
Time Spent: 4.5h  (was: 4h 20m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> P

[jira] [Commented] (BEAM-5176) FailOnWarnings behave differently between CLI and Intellij build

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652520#comment-16652520
 ] 

Kenneth Knowles commented on BEAM-5176:
---

Incidentally this repros on the command line for me suddenly.

> FailOnWarnings behave differently between CLI and Intellij build 
> -
>
> Key: BEAM-5176
> URL: https://issues.apache.org/jira/browse/BEAM-5176
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Etienne Chauchot
>Assignee: Kenneth Knowles
>Priority: Major
>
>  In command line the build passes but fails on the IDE because of warnings. 
> To make it pass I had to put false in failOnWarnings in ApplyJavaNature



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5176) FailOnWarnings behave differently between CLI and Intellij build

2018-10-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652517#comment-16652517
 ] 

Kenneth Knowles commented on BEAM-5176:
---

Actually when I do {{./gradlew --debug}} I see the following passed to the 
failing javac command:
 
{code:java}
-Xlint:all -Werror -XepDisableWarningsInGeneratedCode
-XepExcludedPaths:(.*/)?(build/generated.*avro-java|build/generated)/.*
-Xep:MutableConstantField:OFF -Xlint:-options -Xlint:-cast -Xlint:-deprecation
-Xlint:-processing -Xlint:-rawtypes -Xlint:-serial -Xlint:-try -Xlint:-unchecked
-Xlint:-varargs
{code}
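
To read the argument list above at a glance, a small script can separate the lint categories that are switched off from the flag that turns warnings into errors. This is only a sketch: the helper names are hypothetical, and the -XepExcludedPaths argument is left out of the string because it is wrapped in the archived message.

```python
JAVAC_ARGS = (
    "-Xlint:all -Werror -XepDisableWarningsInGeneratedCode "
    "-Xep:MutableConstantField:OFF -Xlint:-options -Xlint:-cast "
    "-Xlint:-deprecation -Xlint:-processing -Xlint:-rawtypes "
    "-Xlint:-serial -Xlint:-try -Xlint:-unchecked -Xlint:-varargs"
)


def disabled_lint_categories(args):
    """Collect the categories disabled with -Xlint:-<name>, in order."""
    prefix = "-Xlint:-"
    return [arg[len(prefix):] for arg in args.split() if arg.startswith(prefix)]


def warnings_are_errors(args):
    """True when -Werror is among the arguments."""
    return "-Werror" in args.split()


print(disabled_lint_categories(JAVAC_ARGS))
print(warnings_are_errors(JAVAC_ARGS))
```

Every lint category that is not listed as disabled stays enabled by -Xlint:all, and with -Werror present any such warning fails the compile.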

> FailOnWarnings behave differently between CLI and Intellij build 
> -
>
> Key: BEAM-5176
> URL: https://issues.apache.org/jira/browse/BEAM-5176
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Etienne Chauchot
>Assignee: Kenneth Knowles
>Priority: Major
>
>  In command line the build passes but fails on the IDE because of warnings. 
> To make it pass I had to put false in failOnWarnings in ApplyJavaNature



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-5765) Document IntelliJ workflow: Perform a full build

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-5765.
--
   Resolution: Fixed
 Assignee: Scott Wegner
Fix Version/s: Not applicable

> Document IntelliJ workflow: Perform a full build
> 
>
> Key: BEAM-5765
> URL: https://issues.apache.org/jira/browse/BEAM-5765
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
> Fix For: Not applicable
>
>
> The current IntelliJ documentation is not well organized. The plan is to 
> re-organize it into a set of developer workflows, with very prescriptive 
> steps that are easy to follow and validate that they are still working.
> This task tracks writing documentation for the scenario: "How-to: Perform a 
> full build"
> The proposed set of workflows to document is listed in this notes doc: 
> https://docs.google.com/document/d/18eXrO9IYll4oOnFb53EBhOtIfx-JLOinTWZSIBFkLk4/edit?usp=sharing
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5767) Document IntelliJ workflow: Run a single unit test

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-5767:
--

Assignee: Scott Wegner

> Document IntelliJ workflow: Run a single unit test
> --
>
> Key: BEAM-5767
> URL: https://issues.apache.org/jira/browse/BEAM-5767
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>
> The current IntelliJ documentation is not well organized. The plan is to 
> re-organize it into a set of developer workflows, with very prescriptive 
> steps that are easy to follow and validate that they are still working.
> This task tracks writing documentation for the scenario: "How-to: Run a 
> single unit test"
> The proposed set of workflows to document is listed in this notes doc: 
> https://docs.google.com/document/d/18eXrO9IYll4oOnFb53EBhOtIfx-JLOinTWZSIBFkLk4/edit?usp=sharing
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-5766) Document IntelliJ workflow: Build and test a single module

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-5766.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> Document IntelliJ workflow: Build and test a single module
> --
>
> Key: BEAM-5766
> URL: https://issues.apache.org/jira/browse/BEAM-5766
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
> Fix For: Not applicable
>
>
> The current IntelliJ documentation is not well organized. The plan is to 
> re-organize it into a set of developer workflows, with very prescriptive 
> steps that are easy to follow and validate that they are still working.
> This task tracks writing documentation for the scenario: "How-to: Build and 
> test a single module"
> The proposed set of workflows to document is listed in this notes doc: 
> https://docs.google.com/document/d/18eXrO9IYll4oOnFb53EBhOtIfx-JLOinTWZSIBFkLk4/edit?usp=sharing
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5766) Document IntelliJ workflow: Build and test a single module

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-5766:
--

Assignee: Scott Wegner

> Document IntelliJ workflow: Build and test a single module
> --
>
> Key: BEAM-5766
> URL: https://issues.apache.org/jira/browse/BEAM-5766
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>
> The current IntelliJ documentation is not well organized. The plan is to 
> re-organize it into a set of developer workflows, with very prescriptive 
> steps that are easy to follow and validate that they are still working.
> This task tracks writing documentation for the scenario: "How-to: Build and 
> test a single module"
> The proposed set of workflows to document is listed in this notes doc: 
> https://docs.google.com/document/d/18eXrO9IYll4oOnFb53EBhOtIfx-JLOinTWZSIBFkLk4/edit?usp=sharing
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4663) Implement Cost calculations for Cost-Based Optimization (CBO)

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4663?focusedWorklogId=155114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155114
 ]

ASF GitHub Bot logged work on BEAM-4663:


Author: ASF GitHub Bot
Created on: 16/Oct/18 20:50
Start Date: 16/Oct/18 20:50
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6656: [BEAM-4663] [SQL] 
CBO cost calculation
URL: https://github.com/apache/beam/pull/6656#issuecomment-430395733
 
 
   Overriding Calcite's cost functions in Beam SQL isn't going to buy us much 
until we implement `getStatistic` in BeamCalciteTable instead of using 
[UNKNOWN](https://github.com/apache/calcite/blob/d59b639d27da704f00eff616324a2c04aa06f84c/core/src/main/java/org/apache/calcite/schema/Statistics.java#L37).
 Calcite heavily weights RowCount and [it is the only attribute 
considered](https://github.com/apache/calcite/blob/d59b639d27da704f00eff616324a2c04aa06f84c/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoCost.java#L98)
 in the initial sort.
   
   This also drops important internal information in the cost model. The 
builtin 
[Aggregate](https://github.com/apache/calcite/blob/d59b639d27da704f00eff616324a2c04aa06f84c/core/src/main/java/org/apache/calcite/rel/core/Aggregate.java#L317)
 prefers the `$SUM0` operator via the cost model. The builtin 
[Join](https://github.com/apache/calcite/blob/d59b639d27da704f00eff616324a2c04aa06f84c/core/src/main/java/org/apache/calcite/rel/core/Join.java#L196)
 takes into account the join condition via the row count estimate. If we are 
going to do this, we need to extend the builtin cost model rather than 
overriding it to preserve this.
   
   I'm also not convinced that the internal model's assumption that dIo = 0 is 
wrong. (That appears to be the primary difference here.) Outside of Aggregate 
operators that assumption is effectively true in Dataflow. This is an area 
where we should have tests showing that our model produces better plans than 
the default.
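
The point about RowCount being the only attribute considered can be illustrated with a toy model. This is a sketch under stated assumptions, not Calcite's API: Cost, is_le and cheapest are hypothetical names, and the comparison simply mirrors the behaviour described in the comment above (rows compared, cpu and io ignored).

```python
from collections import namedtuple

# Toy cost triple; the field names follow the "rows, cpu, io" planner output.
Cost = namedtuple("Cost", ["rows", "cpu", "io"])


def is_le(a, b):
    """Toy comparison: only the row count takes part; cpu and io are ignored."""
    return a.rows <= b.rows


def cheapest(costs):
    """Pick the cheapest cost under is_le, keeping the first one on ties."""
    best = costs[0]
    for candidate in costs[1:]:
        if not is_le(best, candidate):
            best = candidate
    return best


plan_a = Cost(rows=100.0, cpu=101.0, io=0.0)
plan_b = Cost(rows=10.0, cpu=5000.0, io=900.0)  # made-up numbers
print(cheapest([plan_a, plan_b]))
```

In this model plan_b wins despite its much larger cpu and io figures, which is why overriding cost functions changes little while every table reports the same unknown row count.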


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155114)
Time Spent: 1h 50m  (was: 1h 40m)

> Implement Cost calculations for Cost-Based Optimization (CBO) 
> --
>
> Key: BEAM-4663
> URL: https://issues.apache.org/jira/browse/BEAM-4663
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> To support CBO, we should implement methods in each Beam*Rel.java.  
> computeSelfCost(...) as our first step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5058) Python precommits should run E2E tests

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?focusedWorklogId=155113&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155113
 ]

ASF GitHub Bot logged work on BEAM-5058:


Author: ASF GitHub Bot
Created on: 16/Oct/18 20:48
Start Date: 16/Oct/18 20:48
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #6707: [BEAM-5058] Run 
basic ITs in Python Precommit
URL: https://github.com/apache/beam/pull/6707#issuecomment-430395043
 
 
   PreCommit passed. 
   @udim @aaltay Please take a look.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155113)
Time Spent: 20m  (was: 10m)

> Python precommits should run E2E tests
> --
>
> Key: BEAM-5058
> URL: https://issues.apache.org/jira/browse/BEAM-5058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> According to [https://beam.apache.org/contribute/testing/] (which I'm working 
> on), end-to-end tests should be run in precommit on each combination of 
> \{batch, streaming}x\{SDK language}x\{supported runner}.
> At least 2 tests need to be added to Python's precommit: wordcount and 
> wordcount_streaming on Dataflow, and possibly on other supported runners 
> (direct runner and new runners plz).
>  These tests should be configured to run from a Gradle sub-project, so that 
> they're run in parallel to the unit tests.
> Example that parallelizes Java precommit integration tests: 
> [https://github.com/apache/beam/pull/5731]
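
The {batch, streaming} x {SDK language} x {supported runner} matrix described above can be enumerated mechanically. A minimal sketch follows; the SDK and runner lists are placeholder assumptions for illustration, not the project's official support matrix.

```python
from itertools import product

MODES = ["batch", "streaming"]
SDKS = ["java", "python", "go"]    # assumption: illustrative list only
RUNNERS = ["direct", "dataflow"]   # assumption: illustrative list only


def precommit_matrix(modes, sdks, runners):
    """Return every (mode, sdk, runner) combination as a list of tuples."""
    return list(product(modes, sdks, runners))


matrix = precommit_matrix(MODES, SDKS, RUNNERS)
for mode, sdk, runner in matrix:
    print(mode, sdk, runner)
```

Filtering this list for sdk == "python" gives the combinations that a Python precommit would need to cover under these assumed lists.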



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4130?focusedWorklogId=155112&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155112
 ]

ASF GitHub Bot logged work on BEAM-4130:


Author: ASF GitHub Bot
Created on: 16/Oct/18 20:44
Start Date: 16/Oct/18 20:44
Worklog Time Spent: 10m 
  Work Description: tweise edited a comment on issue #6703: [BEAM-4130] Add 
tests for FlinkJobServerDriver
URL: https://github.com/apache/beam/pull/6703#issuecomment-430379066
 
 
   @aaltay note that we are waiting to merge this PR - it is blocked by 
unrelated Java pre-commit issues. This PR needs to go into the release.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155112)
Time Spent: 14h  (was: 13h 50m)

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 2.7.0
>
>  Time Spent: 14h
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-5384) [SQL] Calcite optimizes away LogicalProject

2018-10-16 Thread Rui Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652385#comment-16652385
 ] 

Rui Wang edited comment on BEAM-5384 at 10/16/18 8:19 PM:
--

I have seen that in BeamSQL, LogicalProject is gone for query "SELECT key, 
COUNT( * ) FROM TABLE GROUP BY key". 


was (Author: amaliujia):
I have seen that in BeamSQL, LogicalProject is gone for query "SELECT key, 
COUNT(*) FROM TABLE GROUP BY key". 

> [SQL] Calcite optimizes away LogicalProject
> ---
>
> Key: BEAM-5384
> URL: https://issues.apache.org/jira/browse/BEAM-5384
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Anton Kedin
>Priority: Major
>
> *From 
> [https://stackoverflow.com/questions/52313324/beam-sql-wont-work-when-using-aggregation-in-statement-cannot-plan-execution]
>  :*
> I have a basic Beam pipeline that reads from GCS, does a Beam SQL transform 
> and writes the results to BigQuery.
> When I don't do any aggregation in my SQL statement it works fine:
> {code:java}
> ..
> PCollection outputStream =
> sqlRows.apply(
> "sql_transform",
> SqlTransform.query("select views from PCOLLECTION"));
> outputStream.setCoder(SCHEMA.getRowCoder());
> ..
> {code}
> However, when I try to aggregate with a sum then it fails (throws a 
> CannotPlanException exception):
> {code:java}
> ..
> PCollection outputStream =
> sqlRows.apply(
> "sql_transform",
> SqlTransform.query("select wikimedia_project, 
> sum(views) from PCOLLECTION group by wikimedia_project"));
> outputStream.setCoder(SCHEMA.getRowCoder());
> ..
> {code}
> Stacktrace:
> {code:java}
> Step #1: 11:47:37,562 0[main] INFO  
> org.apache.beam.runners.dataflow.DataflowRunner - 
> PipelineOptions.filesToStage was not specified. Defaulting to files from the 
> classpath: will stage 117 files. Enable logging at DEBUG level to see which 
> files will be staged.
> Step #1: 11:47:39,845 2283 [main] INFO  
> org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQL:
> Step #1: SELECT `PCOLLECTION`.`wikimedia_project`, SUM(`PCOLLECTION`.`views`)
> Step #1: FROM `beam`.`PCOLLECTION` AS `PCOLLECTION`
> Step #1: GROUP BY `PCOLLECTION`.`wikimedia_project`
> Step #1: 11:47:40,387 2825 [main] INFO  
> org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQLPlan>
> Step #1: LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
> Step #1:   BeamIOSourceRel(table=[[beam, PCOLLECTION]])
> Step #1: 
> Step #1: Exception in thread "main" 
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException:
>  Node [rel#7:Subset#1.BEAM_LOGICAL.[]] could not be implemented; planner 
> state:
> Step #1: 
> Step #1: Root: rel#7:Subset#1.BEAM_LOGICAL.[]
> Step #1: Original rel:
> Step #1: LogicalAggregate(subset=[rel#7:Subset#1.BEAM_LOGICAL.[]], 
> group=[{0}], EXPR$1=[SUM($1)]): rowcount = 10.0, cumulative cost = 
> {11.375000476837158 rows, 0.0 cpu, 0.0 io}, id = 5
> Step #1:   BeamIOSourceRel(subset=[rel#4:Subset#0.BEAM_LOGICAL.[]], 
> table=[[beam, PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 
> rows, 101.0 cpu, 0.0 io}, id = 2
> Step #1: 
> Step #1: Sets:
> Step #1: Set#0, type: RecordType(VARCHAR wikimedia_project, BIGINT views)
> Step #1:rel#4:Subset#0.BEAM_LOGICAL.[], best=rel#2, importance=0.81
> Step #1:rel#2:BeamIOSourceRel.BEAM_LOGICAL.[](table=[beam, 
> PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
> Step #1:rel#10:Subset#0.ENUMERABLE.[], best=rel#9, importance=0.405
> Step #1:
> rel#9:BeamEnumerableConverter.ENUMERABLE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[]),
>  rowcount=100.0, cumulative cost={1.7976931348623157E308 rows, 
> 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
> Step #1: Set#1, type: RecordType(VARCHAR wikimedia_project, BIGINT EXPR$1)
> Step #1:rel#6:Subset#1.NONE.[], best=null, importance=0.9
> Step #1:
> rel#5:LogicalAggregate.NONE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[],group={0},EXPR$1=SUM($1)),
>  rowcount=10.0, cumulative cost={inf}
> Step #1:rel#7:Subset#1.BEAM_LOGICAL.[], best=null, importance=1.0
> Step #1:
> rel#8:AbstractConverter.BEAM_LOGICAL.[](input=rel#6:Subset#1.NONE.[],convention=BEAM_LOGICAL,sort=[]),
>  rowcount=10.0, cumulative cost={inf}
> Step #1: 
> Step #1: 
> Step #1:at 
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:448)
> Step #1:at 
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset.buildChea

[jira] [Commented] (BEAM-5384) [SQL] Calcite optimizes away LogicalProject

2018-10-16 Thread Rui Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652385#comment-16652385
 ] 

Rui Wang commented on BEAM-5384:


I have seen that in BeamSQL, LogicalProject is gone for query "SELECT key, 
COUNT(*) FROM TABLE GROUP BY key". 

> [SQL] Calcite optimizes away LogicalProject
> ---
>
> Key: BEAM-5384
> URL: https://issues.apache.org/jira/browse/BEAM-5384
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Anton Kedin
>Priority: Major
>
> *From 
> [https://stackoverflow.com/questions/52313324/beam-sql-wont-work-when-using-aggregation-in-statement-cannot-plan-execution]
>  :*
> I have a basic Beam pipeline that reads from GCS, does a Beam SQL transform 
> and writes the results to BigQuery.
> When I don't do any aggregation in my SQL statement it works fine:
> {code:java}
> ..
> PCollection outputStream =
> sqlRows.apply(
> "sql_transform",
> SqlTransform.query("select views from PCOLLECTION"));
> outputStream.setCoder(SCHEMA.getRowCoder());
> ..
> {code}
> However, when I try to aggregate with a sum then it fails (throws a 
> CannotPlanException exception):
> {code:java}
> ..
> PCollection outputStream =
> sqlRows.apply(
> "sql_transform",
> SqlTransform.query("select wikimedia_project, 
> sum(views) from PCOLLECTION group by wikimedia_project"));
> outputStream.setCoder(SCHEMA.getRowCoder());
> ..
> {code}
> Stacktrace:
> {code:java}
> Step #1: 11:47:37,562 0[main] INFO  
> org.apache.beam.runners.dataflow.DataflowRunner - 
> PipelineOptions.filesToStage was not specified. Defaulting to files from the 
> classpath: will stage 117 files. Enable logging at DEBUG level to see which 
> files will be staged.
> Step #1: 11:47:39,845 2283 [main] INFO  
> org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQL:
> Step #1: SELECT `PCOLLECTION`.`wikimedia_project`, SUM(`PCOLLECTION`.`views`)
> Step #1: FROM `beam`.`PCOLLECTION` AS `PCOLLECTION`
> Step #1: GROUP BY `PCOLLECTION`.`wikimedia_project`
> Step #1: 11:47:40,387 2825 [main] INFO  
> org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQLPlan>
> Step #1: LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
> Step #1:   BeamIOSourceRel(table=[[beam, PCOLLECTION]])
> Step #1: 
> Step #1: Exception in thread "main" 
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException:
>  Node [rel#7:Subset#1.BEAM_LOGICAL.[]] could not be implemented; planner 
> state:
> Step #1: 
> Step #1: Root: rel#7:Subset#1.BEAM_LOGICAL.[]
> Step #1: Original rel:
> Step #1: LogicalAggregate(subset=[rel#7:Subset#1.BEAM_LOGICAL.[]], 
> group=[{0}], EXPR$1=[SUM($1)]): rowcount = 10.0, cumulative cost = 
> {11.375000476837158 rows, 0.0 cpu, 0.0 io}, id = 5
> Step #1:   BeamIOSourceRel(subset=[rel#4:Subset#0.BEAM_LOGICAL.[]], 
> table=[[beam, PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 
> rows, 101.0 cpu, 0.0 io}, id = 2
> Step #1: 
> Step #1: Sets:
> Step #1: Set#0, type: RecordType(VARCHAR wikimedia_project, BIGINT views)
> Step #1:rel#4:Subset#0.BEAM_LOGICAL.[], best=rel#2, importance=0.81
> Step #1:rel#2:BeamIOSourceRel.BEAM_LOGICAL.[](table=[beam, 
> PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
> Step #1:rel#10:Subset#0.ENUMERABLE.[], best=rel#9, importance=0.405
> Step #1:
> rel#9:BeamEnumerableConverter.ENUMERABLE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[]),
>  rowcount=100.0, cumulative cost={1.7976931348623157E308 rows, 
> 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
> Step #1: Set#1, type: RecordType(VARCHAR wikimedia_project, BIGINT EXPR$1)
> Step #1:rel#6:Subset#1.NONE.[], best=null, importance=0.9
> Step #1:
> rel#5:LogicalAggregate.NONE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[],group={0},EXPR$1=SUM($1)),
>  rowcount=10.0, cumulative cost={inf}
> Step #1:rel#7:Subset#1.BEAM_LOGICAL.[], best=null, importance=1.0
> Step #1:
> rel#8:AbstractConverter.BEAM_LOGICAL.[](input=rel#6:Subset#1.NONE.[],convention=BEAM_LOGICAL,sort=[]),
>  rowcount=10.0, cumulative cost={inf}
> Step #1: 
> Step #1: 
> Step #1:at 
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:448)
> Step #1:at 
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:298)
> Step #1:at 
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:666

[jira] [Closed] (BEAM-5763) Re-organize IntelliJ docs into workflow tasks

2018-10-16 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-5763.
--
   Resolution: Fixed
Fix Version/s: Not applicable

The IntelliJ documentation is now organized as a set of task-focused pages: 
https://cwiki.apache.org/confluence/display/BEAM/Using+IntelliJ+IDE

> Re-organize IntelliJ docs into workflow tasks
> -
>
> Key: BEAM-5763
> URL: https://issues.apache.org/jira/browse/BEAM-5763
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
> Fix For: Not applicable
>
>
> The current documentation is not well organized. It mostly focuses on how to 
> get an initial setup working, but doesn't talk about common developer tasks 
> (building from scratch, testing a single module / unit test / integration 
> test, recovering from project corruption).
> I'd like to re-organize the documentation so to make it very prescriptive to 
> follow and easy to validate that it works.
> Current set of proposed "workflows" listed in this doc: 
> https://docs.google.com/document/d/18eXrO9IYll4oOnFb53EBhOtIfx-JLOinTWZSIBFkLk4/edit?usp=sharing
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4130?focusedWorklogId=155093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155093
 ]

ASF GitHub Bot logged work on BEAM-4130:


Author: ASF GitHub Bot
Created on: 16/Oct/18 19:59
Start Date: 16/Oct/18 19:59
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6703: [BEAM-4130] Add tests 
for FlinkJobServerDriver
URL: https://github.com/apache/beam/pull/6703#issuecomment-430379066
 
 
   @aaltay note that we are waiting to merge this PR - it is blocked by 
unrelated Java pre-commit issues.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155093)
Time Spent: 13h 50m  (was: 13h 40m)

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 2.7.0
>
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5759) ConcurrentModificationException on JmsIO checkpoint finalization

2018-10-16 Thread JIRA


[ 
https://issues.apache.org/jira/browse/BEAM-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652348#comment-16652348
 ] 

Jean-Baptiste Onofré commented on BEAM-5759:


Thanks for catching this. I'm reviewing the PR.

> ConcurrentModificationException on JmsIO checkpoint finalization
> 
>
> Key: BEAM-5759
> URL: https://issues.apache.org/jira/browse/BEAM-5759
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.8.0
>Reporter: Andrew Fulton
>Assignee: Andrew Fulton
> Fix For: 2.9.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When reading from a JmsIO source, a ConcurrentModificationException can be 
> thrown when checkpoint finalization occurs under heavy load.
> For example:
> {{jsonPayload: {}}
>  {{  exception: "java.util.ConcurrentModificationException}}
>  {{    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:903)}}
>  {{    at java.util.ArrayList$Itr.next(ArrayList.java:853)}}
>  {{    at 
> org.apache.beam.sdk.io.jms.JmsCheckpointMark.finalizeCheckpoint(JmsCheckpointMark.java:65)}}
>  {{    at 
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext$1.run(StreamingModeExecutionContext.java:379)}}
>  {{    at 
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$8.run(StreamingDataflowWorker.java:846)}}
>  {{    at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)}}
>  {{    at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)}}
>  {{    at java.lang.Thread.run(Thread.java:745)}}
>  {{"}}
>  {{  job: "2018-09-27_08_55_18-6454085774348718625"   }}
>  {{  logger: "com.google.cloud.dataflow.worker.StreamingDataflowWorker"   }}
>  {{  message: "Source checkpoint finalization failed:"   }}
>  {{  thread: "309"   }}
>  {{  work: ""   }}
>  {{  worker: "test-andrew-092715504-09270855-tkfp-harness-dnmb"   }}
>  
> Looking at the JmsCheckpointMark code, it appears that access to the pending 
> message list is unprotected - thus if a thread calls finalizeCheckpoint while 
> a separate processing thread adds more messages to the checkpoint mark list 
> then an exception will be thrown.
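
The race described above can be sketched outside of Beam. This is a minimal illustration only, assuming a hypothetical `PendingMessages` class; it is not the actual `JmsCheckpointMark` code and not necessarily the approach taken in the PR. The idea shown is to guard every access to the list with the same lock and to iterate over a private snapshot, so that a concurrent `add` cannot invalidate the iterator.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a checkpoint mark; not the real JmsCheckpointMark. */
class PendingMessages {
  // Guarded by "this": only touched inside synchronized sections.
  private final List<String> pending = new ArrayList<>();

  /** Called by the reader thread for each message it receives. */
  synchronized void add(String messageId) {
    pending.add(messageId);
  }

  /** Called by the finalization thread; returns the ids it processed. */
  List<String> finalizeCheckpoint() {
    List<String> snapshot;
    synchronized (this) {
      // Copy and clear under the lock so no add() can interleave.
      snapshot = new ArrayList<>(pending);
      pending.clear();
    }
    // Iterating the private snapshot is safe without the lock;
    // a real implementation would acknowledge each message here.
    return snapshot;
  }

  /** Number of messages still waiting for finalization. */
  synchronized int size() {
    return pending.size();
  }
}
```

Messages added while a finalization is in progress simply stay in the list for the next checkpoint.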



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5759) ConcurrentModificationException on JmsIO checkpoint finalization

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5759?focusedWorklogId=155092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155092
 ]

ASF GitHub Bot logged work on BEAM-5759:


Author: ASF GitHub Bot
Created on: 16/Oct/18 19:52
Start Date: 16/Oct/18 19:52
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #6702: [BEAM-5759] Ensuring 
JmsIO checkpoint state is accessed and modified safely
URL: https://github.com/apache/beam/pull/6702#issuecomment-430376482
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155092)
Time Spent: 0.5h  (was: 20m)

> ConcurrentModificationException on JmsIO checkpoint finalization
> 
>
> Key: BEAM-5759
> URL: https://issues.apache.org/jira/browse/BEAM-5759
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.8.0
>Reporter: Andrew Fulton
>Assignee: Andrew Fulton
> Fix For: 2.9.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When reading from a JmsIO source, a ConcurrentModificationException can be 
> thrown when checkpoint finalization occurs under heavy load.
> For example:
> jsonPayload: {
>   exception: "java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:903)
>     at java.util.ArrayList$Itr.next(ArrayList.java:853)
>     at org.apache.beam.sdk.io.jms.JmsCheckpointMark.finalizeCheckpoint(JmsCheckpointMark.java:65)
>     at com.google.cloud.dataflow.worker.StreamingModeExecutionContext$1.run(StreamingModeExecutionContext.java:379)
>     at com.google.cloud.dataflow.worker.StreamingDataflowWorker$8.run(StreamingDataflowWorker.java:846)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>   "
>   job: "2018-09-27_08_55_18-6454085774348718625"
>   logger: "com.google.cloud.dataflow.worker.StreamingDataflowWorker"
>   message: "Source checkpoint finalization failed:"
>   thread: "309"
>   work: ""
>   worker: "test-andrew-092715504-09270855-tkfp-harness-dnmb"
>  
> Looking at the JmsCheckpointMark code, it appears that access to the pending 
> message list is unprotected - thus if a thread calls finalizeCheckpoint while 
> a separate processing thread adds more messages to the checkpoint mark list 
> then an exception will be thrown.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-4663) Implement Cost calculations for Cost-Based Optimization (CBO)

2018-10-16 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-4663:
-

Assignee: Kai Jiang

> Implement Cost calculations for Cost-Based Optimization (CBO) 
> --
>
> Key: BEAM-4663
> URL: https://issues.apache.org/jira/browse/BEAM-4663
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> To support CBO, we should implement methods in each Beam*Rel.java, with 
> computeSelfCost(...) as our first step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4663) Implement Cost calculations for Cost-Based Optimization (CBO)

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4663?focusedWorklogId=155089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155089
 ]

ASF GitHub Bot logged work on BEAM-4663:


Author: ASF GitHub Bot
Created on: 16/Oct/18 19:49
Start Date: 16/Oct/18 19:49
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request 
#6656: [BEAM-4663] [SQL] CBO cost calculation
URL: https://github.com/apache/beam/pull/6656#discussion_r225684401
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
 ##
 @@ -111,6 +114,15 @@ public RelWriter explainTerms(RelWriter pw) {
 return pw;
   }
 
+  @Override
+  public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery metadata) {
+    RelNode child = getInput();
+    Double rowCnt = metadata.getRowCount(child);
 
 Review comment:
   A `BoundedSource` does have size estimation that we might be able to use. 
What I think is important is the ability to correctly guide Calcite to apply 
the desired rules.
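
The row-count lookup in the diff above yields a boxed `Double`, so a cost function has to decide what to do when no estimate is available. A minimal sketch of that decision follows, assuming hypothetical `SimpleCost` and `CostSketch` types that stand in for Calcite's `RelOptCost` and the Beam rel nodes. The fallback row count and the per-row CPU factor are arbitrary illustrative values, not anything taken from the PR.

```java
/** Hypothetical cost holder; Calcite's real type is RelOptCost. */
final class SimpleCost {
  final double rows;
  final double cpu;

  SimpleCost(double rows, double cpu) {
    this.rows = rows;
    this.cpu = cpu;
  }
}

/** Hypothetical helper illustrating a self-cost with a missing-estimate fallback. */
final class CostSketch {
  // Assumed placeholder used when the input has no usable row-count estimate.
  static final double DEFAULT_ROW_COUNT = 1000.0;

  static SimpleCost selfCost(Double estimatedRows, double cpuPerRow) {
    boolean usable =
        estimatedRows != null && !estimatedRows.isNaN() && estimatedRows >= 0;
    double rows = usable ? estimatedRows : DEFAULT_ROW_COUNT;
    // CPU cost is modelled as linear in the number of input rows.
    return new SimpleCost(rows, rows * cpuPerRow);
  }
}
```

With a constant fallback every node without statistics gets the same cost, so the planner's rule choice is only as good as the estimates fed into it.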


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155089)
Time Spent: 1h 40m  (was: 1.5h)

> Implement Cost calculations for Cost-Based Optimization (CBO) 
> --
>
> Key: BEAM-4663
> URL: https://issues.apache.org/jira/browse/BEAM-4663
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kai Jiang
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> To support CBO, we should implement methods in each Beam*Rel.java, with 
> computeSelfCost(...) as our first step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=155083&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155083
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 16/Oct/18 19:30
Start Date: 16/Oct/18 19:30
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #6680: [BEAM-5637] Python 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#issuecomment-430369135
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155083)
Time Spent: 4h 20m  (was: 4h 10m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3900) Introduce Euphoria Java 8 DSL

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3900?focusedWorklogId=155081&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155081
 ]

ASF GitHub Bot logged work on BEAM-3900:


Author: ASF GitHub Bot
Created on: 16/Oct/18 19:29
Start Date: 16/Oct/18 19:29
Worklog Time Spent: 10m 
  Work Description: je-ik opened a new pull request #6709: [BEAM-3900] 
docs: TopPerKey is supported by euphoria
URL: https://github.com/apache/beam/pull/6709
 
 
   TopPerKey is supported by the Euphoria DSL; the documentation entry was 
just a leftover.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155081)
Time Spent: 3.5h  (was: 3h 20m)

> Introduce Euphoria Java 8 DSL
> -
>
> Key: BEAM-3900
> URL: https://issues.apache.org/jira/browse/BEAM-3900
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-euphoria
>Reporter: David Moravek
>Assignee: David Moravek
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> This is the umbrella issue for integrating [Euphoria 
> API|http://github.com/seznam/euphoria] into Beam.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4663) Implement Cost calculations for Cost-Based Optimization (CBO)

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4663?focusedWorklogId=155058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155058
 ]

ASF GitHub Bot logged work on BEAM-4663:


Author: ASF GitHub Bot
Created on: 16/Oct/18 18:54
Start Date: 16/Oct/18 18:54
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6656: 
[BEAM-4663] [SQL] CBO cost calculation
URL: https://github.com/apache/beam/pull/6656#discussion_r22599
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
 ##
 @@ -111,6 +114,15 @@ public RelWriter explainTerms(RelWriter pw) {
 return pw;
   }
 
+  @Override
+  public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery metadata) {
+    RelNode child = getInput();
+    Double rowCnt = metadata.getRowCount(child);
 
 Review comment:
   Because there is no support yet for getting a reasonably accurate row-count 
estimate, I suggest we don't work on CBO but focus on something we can control 
(e.g. logical optimization). For example, we can work on figuring out whether 
the rules in 
[BeamRuleSets.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java) 
work. I have seen a case where we added a rule to the list and that rule then 
triggered a bug. Also, some rules are definitely not working because of the 
traits setup.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155058)
Time Spent: 1h 20m  (was: 1h 10m)

> Implement Cost calculations for Cost-Based Optimization (CBO) 
> --
>
> Key: BEAM-4663
> URL: https://issues.apache.org/jira/browse/BEAM-4663
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kai Jiang
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> To support CBO, we should implement methods in each Beam*Rel.java, with 
> computeSelfCost(...) as our first step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4663) Implement Cost calculations for Cost-Based Optimization (CBO)

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4663?focusedWorklogId=155059&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155059
 ]

ASF GitHub Bot logged work on BEAM-4663:


Author: ASF GitHub Bot
Created on: 16/Oct/18 18:54
Start Date: 16/Oct/18 18:54
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6656: 
[BEAM-4663] [SQL] CBO cost calculation
URL: https://github.com/apache/beam/pull/6656#discussion_r22599
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java
 ##
 @@ -111,6 +114,15 @@ public RelWriter explainTerms(RelWriter pw) {
 return pw;
   }
 
+  @Override
+  public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery metadata) {
+    RelNode child = getInput();
+    Double rowCnt = metadata.getRowCount(child);
 
 Review comment:
   Because there is no support yet for getting a reasonably accurate row-count 
estimate, I suggest we don't work on CBO but focus on something we can control 
(e.g. logical optimization). For example, we can work on figuring out whether 
the rules in 
[BeamRuleSets.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java) 
work. I have seen a case where we added a rule to the list and that rule then 
triggered a bug. Also, some rules are definitely not working because of the 
traits setup.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155059)
Time Spent: 1.5h  (was: 1h 20m)

> Implement Cost calculations for Cost-Based Optimization (CBO) 
> --
>
> Key: BEAM-4663
> URL: https://issues.apache.org/jira/browse/BEAM-4663
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kai Jiang
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> To support CBO, we should implement methods in each Beam*Rel.java, with 
> computeSelfCost(...) as our first step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5058) Python precommits should run E2E tests

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5058?focusedWorklogId=155052&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155052
 ]

ASF GitHub Bot logged work on BEAM-5058:


Author: ASF GitHub Bot
Created on: 16/Oct/18 18:32
Start Date: 16/Oct/18 18:32
Worklog Time Spent: 10m 
  Work Description: markflyhigh opened a new pull request #6707: 
[BEAM-5058] Run basic ITs in Python Precommit
URL: https://github.com/apache/beam/pull/6707
 
 
   According to https://beam.apache.org/contribute/testing/, we want to run 
basic e2e tests in Python Precommit.
   
   This change adds the following suites to Python precommit:
- directRunnerIT: 3 integration tests that run with DirectRunner and finish 
within 1 min.
- precommitIT: wordcount batch and streaming integration tests that run 
against DataflowRunner.
   
   I expect this change to increase precommit time by ~10 mins, for an overall 
runtime of 30 - 40 mins based on recent data.
   
   
   
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-

[jira] [Work logged] (BEAM-5114) Create example uber jars for supported runners

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5114?focusedWorklogId=155047&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155047
 ]

ASF GitHub Bot logged work on BEAM-5114:


Author: ASF GitHub Bot
Created on: 16/Oct/18 18:18
Start Date: 16/Oct/18 18:18
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #6191: [BEAM-5114] Create 
example uber jars
URL: https://github.com/apache/beam/pull/6191#issuecomment-430343037
 
 
   This pull request has been closed due to lack of activity. If you think that 
is incorrect, or the pull request requires review, you can revive the PR at any 
time.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155047)
Time Spent: 2h  (was: 1h 50m)

> Create example uber jars for supported runners
> --
>
> Key: BEAM-5114
> URL: https://issues.apache.org/jira/browse/BEAM-5114
> Project: Beam
>  Issue Type: New Feature
>  Components: examples-java
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Producing these artifacts has several benefits:
>  * Gives an example of how to package user code for different runners
>  * Makes ad-hoc testing of runner changes against real user pipelines easier
>  * Enables end-to-end integration testing of pipelines against different 
> runner services



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5114) Create example uber jars for supported runners

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5114?focusedWorklogId=155048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155048
 ]

ASF GitHub Bot logged work on BEAM-5114:


Author: ASF GitHub Bot
Created on: 16/Oct/18 18:18
Start Date: 16/Oct/18 18:18
Worklog Time Spent: 10m 
  Work Description: stale[bot] closed pull request #6191: [BEAM-5114] 
Create example uber jars
URL: https://github.com/apache/beam/pull/6191
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/examples/java/direct/build.gradle 
b/examples/java/direct/build.gradle
new file mode 100644
index 000..751b2f35457
--- /dev/null
+++ b/examples/java/direct/build.gradle
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import groovy.json.JsonOutput
+
+apply plugin: org.apache.beam.gradle.BeamModulePlugin
+// Disable default shadow jar closure and include all class files and resources.
+applyJavaNature(shadowClosure: {})
+
+dependencies {
+  compile project(path: ":beam-examples-java", configuration: "shadow")
+  compile project(path: ":beam-examples-java", configuration: "directRunnerPreCommit")
+}
diff --git a/examples/java/flink/build.gradle b/examples/java/flink/build.gradle
new file mode 100644
index 000..c0674f4d48a
--- /dev/null
+++ b/examples/java/flink/build.gradle
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import groovy.json.JsonOutput
+
+apply plugin: org.apache.beam.gradle.BeamModulePlugin
+// Disable default shadow jar closure and include all class files and resources.
+applyJavaNature(shadowClosure: {})
+
+dependencies {
+  compile project(path: ":beam-examples-java", configuration: "shadow")
+  compile project(path: ":beam-examples-java", configuration: "flinkRunnerPreCommit")
+}
diff --git a/examples/java/portable/build.gradle 
b/examples/java/portable/build.gradle
new file mode 100644
index 000..8e342feab22
--- /dev/null
+++ b/examples/java/portable/build.gradle
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import groovy.json.JsonOutput
+
+apply plugin: org.apache.beam.gradle.BeamModulePlugin
+// Disable default shadow jar closure and include all class files and resources.
+applyJavaNature(shadowClosure: {})
+
+dependencies {
+  compile project(path: ":beam-examples-java", configuration: "shadow")
+

[jira] [Commented] (BEAM-5315) Finish Python 3 porting for io module

2018-10-16 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652186#comment-16652186
 ] 

Valentyn Tymofieiev commented on BEAM-5315:
---

Hi [~udim], thanks for upgrading the Datastore dependency. You could:

1. Make sure the existing ~35 Datastore unit tests pass on Python 2.
2. Modify setup.py to remove the restriction that prevents installation of the 
Datastore client library on Python 3: 
https://github.com/apache/beam/blob/3e7e0346492f9c70903590c50133f9f5a5acf9ee/sdks/python/setup.py#L143.
3. Run the Datastore unit tests in Python 3. You can follow 
https://s.apache.org/beam-py3-conversion-quick-start for instructions. Chances 
are some tests will still fail for other reasons, since not all IO tests 
currently work in Python 3.

Eventually, we should run all tests in the IO module as part of the Python 3 
presubmit suite; currently we only run a subset: 
https://github.com/apache/beam/blob/3e7e0346492f9c70903590c50133f9f5a5acf9ee/sdks/python/tox.ini#L61.

> Finish Python 3 porting for io module
> -
>
> Key: BEAM-5315
> URL: https://issues.apache.org/jira/browse/BEAM-5315
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Simon
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4130?focusedWorklogId=155040&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155040
 ]

ASF GitHub Bot logged work on BEAM-4130:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:56
Start Date: 16/Oct/18 17:56
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6703: [BEAM-4130] Add tests 
for FlinkJobServerDriver
URL: https://github.com/apache/beam/pull/6703#issuecomment-430335378
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155040)
Time Spent: 13h 40m  (was: 13.5h)

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 2.7.0
>
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4492) Update Python bigquery library to latest version

2018-10-16 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri resolved BEAM-4492.
-
   Resolution: Duplicate
Fix Version/s: Not applicable

I'm taking over upgrading the bigquery client. See the PR in the linked issue.

> Update Python bigquery library to latest version
> 
>
> Key: BEAM-4492
> URL: https://issues.apache.org/jira/browse/BEAM-4492
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Charles Chen
>Priority: Major
> Fix For: Not applicable
>
>
> Current google-cloud-bigquery is set to 0.25.0 in 
> https://github.com/apache/beam/blob/master/sdks/python/setup.py#L130
> However, the latest version is 1.2.0.
> According to comment in setup.py, this library is only used for testing, so 
> it should be easy to update. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=155035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155035
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:54
Start Date: 16/Oct/18 17:54
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6592: 
[BEAM-4176] Enable Post Commit JAVA PVR tests for Flink
URL: https://github.com/apache/beam/pull/6592#discussion_r225645564
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
 ##
 @@ -115,6 +115,7 @@ private void scheduleRelease(JobInfo jobInfo) {
     int environmentCacheTTLMillis =
         pipelineOptions.as(PortablePipelineOptions.class).getEnvironmentCacheMillis();
     if (environmentCacheTTLMillis > 0) {
+      // Do immediate cleanup if this class is not loaded on Flink parent classloader.
       if (this.getClass().getClassLoader() != ExecutionEnvironment.class.getClassLoader()) {
 
 Review comment:
   *Flink* classes (org.apache.flink.*) are always loaded through the parent 
classloader. Loaded classes are always tied to the classloader that was 
used to load them.
   
   All other classes are first loaded through the child classloader, then the 
parent by default.
   
   ```yaml
   classloader.resolve-order: Whether Flink should use a child-first 
ClassLoader when loading user-code classes or a parent-first ClassLoader. Can 
be one of parent-first or child-first. (default: child-first)
   ```
   
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/config.html#common-options


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 155035)
Time Spent: 36.5h  (was: 36h 20m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 
> PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 36.5h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=155037&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155037
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:54
Start Date: 16/Oct/18 17:54
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6592: 
[BEAM-4176] Enable Post Commit JAVA PVR tests for Flink
URL: https://github.com/apache/beam/pull/6592#discussion_r225645778
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
 ##
 @@ -115,6 +115,7 @@ private void scheduleRelease(JobInfo jobInfo) {
 int environmentCacheTTLMillis =
 
pipelineOptions.as(PortablePipelineOptions.class).getEnvironmentCacheMillis();
 if (environmentCacheTTLMillis > 0) {
+  // Do immediate cleanup if this class is not loaded on Flink parent 
classloader.
   if (this.getClass().getClassLoader() != 
ExecutionEnvironment.class.getClassLoader()) {
 
 Review comment:
   `FlinkUserCodeClassLoader` has been removed, and its replacements, 
`ChildFirstClassLoader` and `ParentFirstClassloader`, are package-private. So 
checking for them is not really an option anymore.




Issue Time Tracking
---

Worklog Id: (was: 155037)
Time Spent: 36h 40m  (was: 36.5h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 
> PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 36h 40m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.





[jira] [Work logged] (BEAM-1251) Python 3 Support

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1251?focusedWorklogId=155034&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155034
 ]

ASF GitHub Bot logged work on BEAM-1251:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:51
Start Date: 16/Oct/18 17:51
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #6679: [BEAM-1251] Add a 
link to Python 3 Conversion Quick Start Guide to the list of ongoing efforts on 
Beam site.
URL: https://github.com/apache/beam/pull/6679#issuecomment-430333767
 
 
   @tvalentyn I had looked into the possibility of multiple templates. However, 
it appears that the workflow for selecting a pull request template requires using a 
special pull request URL with query parameters 
([docs](https://help.github.com/articles/about-automation-for-issues-and-pull-requests-with-query-parameters/)).
 I don't like the idea of requiring contributors to mess with URLs to open a PR.




Issue Time Tracking
---

Worklog Id: (was: 155034)
Time Spent: 23.5h  (was: 23h 20m)

> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Eyad Sibai
>Assignee: Robbe
>Priority: Major
>  Time Spent: 23.5h
>  Remaining Estimate: 0h
>
> I have been trying to use Google Datalab with Python 3. As far as I can see, 
> several packages that Google Datalab depends on do not support Python 3 yet. 
> This is one of them.
> https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/6





[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=155033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155033
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:45
Start Date: 16/Oct/18 17:45
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6592: [BEAM-4176] Enable 
Post Commit JAVA PVR tests for Flink
URL: https://github.com/apache/beam/pull/6592#issuecomment-430331548
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 155033)
Time Spent: 36h 20m  (was: 36h 10m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 
> PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 36h 20m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.





[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=155032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155032
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:44
Start Date: 16/Oct/18 17:44
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6592: [BEAM-4176] Enable 
Post Commit JAVA PVR tests for Flink
URL: https://github.com/apache/beam/pull/6592
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/.test-infra/jenkins/job_PostCommit_Java_PortableValidatesRunner_Flink.groovy 
b/.test-infra/jenkins/job_PostCommit_Java_PortableValidatesRunner_Flink.groovy
new file mode 100644
index 000..ad09a0ab53d
--- /dev/null
+++ 
b/.test-infra/jenkins/job_PostCommit_Java_PortableValidatesRunner_Flink.groovy
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonJobProperties as commonJobProperties
+import PostcommitJobBuilder
+
+// This job runs the suite of ValidatesRunner tests against the Flink runner.
+PostcommitJobBuilder.postCommitJob('beam_PostCommit_Java_PVR_Flink',
+  'Run Java Flink PortableValidatesRunner', 'Java Flink 
PortableValidatesRunner Tests', this) {
+  description('Runs the Java PortableValidatesRunner suite on the Flink 
runner.')
+
+  // Set common parameters.
+  commonJobProperties.setTopLevelMainJobProperties(delegate)
+
+  // Publish all test results to Jenkins
+  publishers {
+archiveJunit('**/build/test-results/**/*.xml')
+  }
+
+  // Gradle goals for this job.
+  steps {
+gradle {
+  rootBuildScriptDir(commonJobProperties.checkoutDir)
+  tasks(':beam-runners-flink_2.11-job-server:validatesPortableRunner')
+  commonJobProperties.setGradleSwitches(delegate)
+}
+  }
+}
diff --git 
a/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy 
b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
index d1a80dc5e29..28548c09ddd 100644
--- a/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
+++ b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
@@ -1499,6 +1499,7 @@ artifactId=${project.name}
 testClassesDirs = 
project.files(project.project(":beam-sdks-java-core").sourceSets.test.output.classesDirs,
 project.project(":beam-runners-core-java").sourceSets.test.output.classesDirs)
 maxParallelForks config.parallelism
 useJUnit(config.testCategories)
+dependsOn ':beam-sdks-java-container:docker'
   }
 }
   }
diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
index 679c7cc4bb9..93dc6f0121c 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
@@ -170,6 +170,11 @@ public void run() {
 }
   }
 
+  public String start() throws IOException {
+jobServer = createJobServer();
+return jobServer.getApiServiceDescriptor().getUrl();
+  }
+
   public void stop() {
 if (jobServer != null) {
   try {
diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
index bb2b9dcbe16..90d291ea28a 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
@@ -115,6 +115,7 @@ private void scheduleRelease(Job

[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=155030&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155030
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:42
Start Date: 16/Oct/18 17:42
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6592: 
[BEAM-4176] Enable Post Commit JAVA PVR tests for Flink
URL: https://github.com/apache/beam/pull/6592#discussion_r225641625
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
 ##
 @@ -115,6 +115,7 @@ private void scheduleRelease(JobInfo jobInfo) {
 int environmentCacheTTLMillis =
 
pipelineOptions.as(PortablePipelineOptions.class).getEnvironmentCacheMillis();
 if (environmentCacheTTLMillis > 0) {
+  // Do immediate cleanup if this class is not loaded on Flink parent 
classloader.
   if (this.getClass().getClassLoader() != 
ExecutionEnvironment.class.getClassLoader()) {
 
 Review comment:
   Let's continue investigating this as a follow-up, since it isn't directly 
linked to this PR. Merging..




Issue Time Tracking
---

Worklog Id: (was: 155030)
Time Spent: 36h  (was: 35h 50m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 
> PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 36h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.





[jira] [Work logged] (BEAM-3746) Count.globally should override getIncompatibleGlobalWindowErrorMessage to tell the user the usage that is currently only in javadoc

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3746?focusedWorklogId=155022&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155022
 ]

ASF GitHub Bot logged work on BEAM-3746:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:35
Start Date: 16/Oct/18 17:35
Worklog Time Spent: 10m 
  Work Description: kennknowles closed pull request #6632: [BEAM-3746] 
Change incompatible message from referencing the output collection to 
referencing the input collection
URL: https://github.com/apache/beam/pull/6632
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
index 3756f1fd42d..d2cfaecad0c 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
@@ -107,7 +107,7 @@
   abstract static class AbstractGlobalCombineFn
   implements GlobalCombineFn, Serializable {
 private static final String INCOMPATIBLE_GLOBAL_WINDOW_ERROR_MESSAGE =
-"Default values are not supported in Combine.globally() if the output "
+"Default values are not supported in Combine.globally() if the input "
 + "PCollection is not windowed by GlobalWindows. Instead, use "
 + "Combine.globally().withoutDefaults() to output an empty 
PCollection if the input "
 + "PCollection is empty, or Combine.globally().asSingletonView() 
to get the default "
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Top.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Top.java
index 59e569e09a6..354bcb139a3 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Top.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Top.java
@@ -414,10 +414,10 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 
 @Override
 public String getIncompatibleGlobalWindowErrorMessage() {
-  return "Default values are not supported in Top.[of, smallest, 
largest]() if the output "
+  return "Default values are not supported in Top.[of, smallest, 
largest]() if the input "
   + "PCollection is not windowed by GlobalWindows. Instead, use "
-  + "Top.[of, smallest, largest]().withoutDefaults() to output an 
empty PCollection if the"
-  + " input PCollection is empty, or Top.[of, smallest, 
largest]().asSingletonView() to "
+  + "Top.[of, smallest, largest]().withoutDefaults() to output an 
empty PCollection if the "
+  + "input PCollection is empty, or Top.[of, smallest, 
largest]().asSingletonView() to "
   + "get a PCollection containing the empty list if the input 
PCollection is empty.";
 }
   }


 




Issue Time Tracking
---

Worklog Id: (was: 155022)
Time Spent: 3h  (was: 2h 50m)

> Count.globally should override getIncompatibleGlobalWindowErrorMessage to 
> tell the user the usage that is currently only in javadoc
> ---
>
> Key: BEAM-3746
> URL: https://issues.apache.org/jira/browse/BEAM-3746
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Sam Rohde
>Priority: Major
>  Labels: beginner, newbie, starter
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/sdks/javadoc/2.3.0/org/apache/beam/sdk/transforms/Count.html#globally--
> "Note: if the input collection uses a windowing strategy other than 
> GlobalWindows, use Combine.globally(Count.combineFn()).withoutDefaults() 
> instead."
> But the actual crash a user gets is:
> "java.lang.IllegalStateException: Default values are not supported in 
> Combine.globally() if the output PCollection is not windowed by 
> GlobalWindows. Instead, use Combine.globally().withoutDefaults() to output an 
> empty PCollection if the input PCollection is empty, or 
> Combine.globally().asSingletonView() to get the default output of the 
> CombineFn if t

[jira] [Work logged] (BEAM-1251) Python 3 Support

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1251?focusedWorklogId=155020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155020
 ]

ASF GitHub Bot logged work on BEAM-1251:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:32
Start Date: 16/Oct/18 17:32
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #6679: [BEAM-1251] Add a 
link to Python 3 Conversion Quick Start Guide to the list of ongoing efforts on 
Beam site.
URL: https://github.com/apache/beam/pull/6679
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/website/src/contribute/index.md b/website/src/contribute/index.md
index 230c3df2ae1..9414a7e748f 100644
--- a/website/src/contribute/index.md
+++ b/website/src/contribute/index.md
@@ -328,8 +328,9 @@ Work is in progress to add Python 3 support to Beam.  
Current goal is to make Be
 
  - 
[Proposal](https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE)
  - [Kanban 
Board](https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245&view=detail)
+ - [Python 3 Conversion Quick Start 
Guide](https://docs.google.com/document/d/1s1BJVCY65LB_SYK1SU1u7NbZiFANoq-nEYaEvzRbYlA)
 
-Contributions are welcome! If you are interested to help, you can select a 
subpackage to port and assign yourself the corresponding issue. Comment on the 
issue if you cannot assign it yourself.
+Contributions are welcome! If you are interested to help, you can select an 
unassigned issue in the Kanban board and assign it to yourself. Comment on the 
issue if you cannot assign it yourself.
 When submitting a new PR, please tag 
[@RobbeSneyders](https://github.com/robbesneyders), 
[@aaltay](https://github.com/aaltay), and 
[@tvalentyn](https://github.com/tvalentyn).
 
 ### Next Java LTS version support (Java 11 / 18.9)


 




Issue Time Tracking
---

Worklog Id: (was: 155020)
Time Spent: 23h 20m  (was: 23h 10m)

> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Eyad Sibai
>Assignee: Robbe
>Priority: Major
>  Time Spent: 23h 20m
>  Remaining Estimate: 0h
>
> I have been trying to use Google Datalab with Python 3. As far as I can see, 
> several packages that Google Datalab depends on do not support Python 3 yet. 
> This is one of them.
> https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/6





[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=155016&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-155016
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 16/Oct/18 17:24
Start Date: 16/Oct/18 17:24
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6592: 
[BEAM-4176] Enable Post Commit JAVA PVR tests for Flink
URL: https://github.com/apache/beam/pull/6592#discussion_r225635682
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
 ##
 @@ -115,6 +115,7 @@ private void scheduleRelease(JobInfo jobInfo) {
 int environmentCacheTTLMillis =
 
pipelineOptions.as(PortablePipelineOptions.class).getEnvironmentCacheMillis();
 if (environmentCacheTTLMillis > 0) {
+  // Do immediate cleanup if this class is not loaded on Flink parent 
classloader.
   if (this.getClass().getClassLoader() != 
ExecutionEnvironment.class.getClassLoader()) {
 
 Review comment:
   FlinkUserCodeClassloader seems to have been removed since Flink 1.1, so we 
can't use it.
   This PR says that Flink now loads all Flink classes through the parent 
classloader: https://github.com/apache/flink/pull/4891




Issue Time Tracking
---

Worklog Id: (was: 155016)
Time Spent: 35h 50m  (was: 35h 40m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 
> PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 35h 50m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.




