Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3945

2018-01-19 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_TFRecordIOIT #36

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam4 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
 > git rev-list 898277297be7160900fdb48ed486a257a6bc92c0 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:592)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at 

Build failed in Jenkins: beam_PerformanceTests_Spark #1256

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam2 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
 > git rev-list 898277297be7160900fdb48ed486a257a6bc92c0 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:592)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:592)
  

Build failed in Jenkins: beam_PerformanceTests_Python #812

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
 > git rev-list 898277297be7160900fdb48ed486a257a6bc92c0 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:592)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:592)
 

Build failed in Jenkins: beam_PerformanceTests_Compressed_TextIOIT #36

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam6 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
 > git rev-list 898277297be7160900fdb48ed486a257a6bc92c0 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:592)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at 

Build failed in Jenkins: beam_PerformanceTests_AvroIOIT #37

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam8 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
 > git rev-list 898277297be7160900fdb48ed486a257a6bc92c0 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:592)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1724)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:421)
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3c4a2001e5364657f6596bfc338f792e4797712d (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3c4a2001e5364657f6596bfc338f792e4797712d
Commit message: "[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)"
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
java.util.concurrent.TimeoutException: Timeout waiting for task.
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:259)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:91)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at com.atlassian.jira.rest.client.internal.async.DelegatingPromise.get(DelegatingPromise.java:107)
at hudson.plugins.jira.JiraRestService.getIssuesFromJqlSearch(JiraRestService.java:177)
at hudson.plugins.jira.JiraSession.getIssuesFromJqlSearch(JiraSession.java:135)
at io.jenkins.blueocean.service.embedded.jira.JiraSCMListener.onChangeLogParsed(JiraSCMListener.java:43)
at hudson.model.listeners.SCMListener.onChangeLogParsed(SCMListener.java:120)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:590)
Caused: java.io.IOException: Failed to parse changelog
at 

[jira] [Commented] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333167#comment-16333167
 ] 

Reuven Lax commented on BEAM-3503:
--

If you're using the Dataflow streaming runner, then a Beam-only fix is 
insufficient as Dataflow uses a separate implementation of PubSubIO in the 
runner. You'll also need to file a bug with Google to support this feature in 
the Dataflow runner.

> PubsubIO - DynamicDestinations
> --
>
> Key: BEAM-3503
> URL: https://issues.apache.org/jira/browse/BEAM-3503
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: 2.2.0
>Reporter: Nalseez Duke
>Assignee: Reuven Lax
>Priority: Minor
>
> PubsubIO does not support the "DynamicDestinations" notion that is currently 
> implemented for File-based I/O and BigQueryIO.
> It would be nice if PubsubIO could also support this functionality - the 
> ability to write to a Pub/Sub topic dynamically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
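
For context on what the requested feature would replace: with the current 2.x Java SDK the Pub/Sub topic is fixed at pipeline-construction time, so routing to several topics means partitioning the collection and attaching one write per topic. The sketch below illustrates that workaround under stated assumptions; it is not code from this thread, the class name, project, and topic names are hypothetical, and it assumes the PubsubIO.writeStrings().to(...) surface of the Java SDK. A DynamicDestinations-style API would instead choose the topic per element.

import java.util.Arrays;
import java.util.List;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Partition;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;

// Hypothetical example: fan out to a statically known list of topics.
public class StaticTopicRouting {
  public static void main(String[] args) {
    List<String> topics = Arrays.asList(
        "projects/my-project/topics/orders",     // hypothetical topic
        "projects/my-project/topics/payments");  // hypothetical topic

    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    PCollection<String> events =
        p.apply(Create.of("orders:e1", "payments:e2", "orders:e3"));

    // Route each element to the partition whose topic matches its prefix.
    PCollectionList<String> byTopic = events.apply(
        Partition.of(topics.size(),
            (Partition.PartitionFn<String>)
                (elem, numPartitions) -> elem.startsWith("payments") ? 1 : 0));

    // One fixed-topic write per partition; the topic cannot depend on the element.
    for (int i = 0; i < topics.size(); i++) {
      byTopic.get(i).apply("WriteTo" + i, PubsubIO.writeStrings().to(topics.get(i)));
    }
    p.run();
  }
}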


Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #5708

2018-01-19 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #4758

2018-01-19 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3944

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-3502) Avoid use of proto.Builder.clone() in DatastoreIO

2018-01-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-3502.
-
Resolution: Fixed

> Avoid use of proto.Builder.clone() in DatastoreIO
> -
>
> Key: BEAM-3502
> URL: https://issues.apache.org/jira/browse/BEAM-3502
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Affects Versions: 2.2.0
>Reporter: Larry Li
>Assignee: Larry Li
>Priority: Minor
> Fix For: 2.3.0
>
>
> DatastoreIO uses proto.Builder.clone() here:
> [https://github.com/apache/beam/blob/c0f0e1fd63ce1e9dfe1db71adf1c8b9e88ce7038/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java#L893]
>  
> It's only used in one place for actual runtime code, but this causes 
> incompatibility problems with Google-internal Java proto generation, i.e. we 
> get a 'NoSuchMethodError' when attempting to run the pipeline with internal 
> build tools.
>  
> This is a known problem that's already been worked around once:
> https://issues.apache.org/jira/browse/BEAM-2392
> ...but the fix only applied to BigtableServiceImpl. This extends those changes 
> to DatastoreIO, replacing its single use of clone(). Associated tests 
> shouldn't need refactoring, as this only appears as a problem at runtime.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
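
The reasoning behind this fix, in brief: Message.toBuilder() already returns a new, independent Builder populated from the message, so the extra clone() call adds nothing while depending on a generated Builder.clone() method that some proto runtimes do not provide. Below is a minimal sketch of that pattern under stated assumptions, not the actual DatastoreV1 source; the helper name, the userLimit parameter, and the QUERY_BATCH_LIMIT value are stand-ins. The real one-line diff appears in the commit notification later in this list.

import com.google.datastore.v1.Query;
import com.google.protobuf.Int32Value;

// Hypothetical helper mirroring the pattern touched by the fix.
class QueryBatching {
  private static final int QUERY_BATCH_LIMIT = 500; // stand-in value

  static Query nextBatchQuery(Query query, int userLimit) {
    // Before the fix: Query.Builder queryBuilder = query.toBuilder().clone();
    // After the fix: toBuilder() alone yields a fresh builder and avoids Builder.clone().
    Query.Builder queryBuilder = query.toBuilder();
    queryBuilder.setLimit(
        Int32Value.newBuilder().setValue(Math.min(userLimit, QUERY_BATCH_LIMIT)));
    return queryBuilder.build();
  }
}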


[jira] [Created] (BEAM-3504) Make streaming mobile gaming examples runnable on DataflowRunner

2018-01-19 Thread Ahmet Altay (JIRA)
Ahmet Altay created BEAM-3504:
-

 Summary: Make streaming mobile gaming examples runnable on DataflowRunner
 Key: BEAM-3504
 URL: https://issues.apache.org/jira/browse/BEAM-3504
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Ahmet Altay


Streaming mobile gaming examples had two issues preventing them from running on 
DataflowRunner: (1) support for the --save_main_session flag was only recently added, 
and (2) [https://github.com/apache/beam/pull/4455] addresses a usability issue 
related to combiners.

We can now clean up the warnings and verify that these examples actually run 
on Dataflow.

cc: [~angoenka] [~dcavazos]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated: [BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)

2018-01-19 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 3c4a200  [BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)
3c4a200 is described below

commit 3c4a2001e5364657f6596bfc338f792e4797712d
Author: Exprosed 
AuthorDate: Fri Jan 19 20:45:51 2018 -0500

[BEAM-3502] Remove usage of proto.Builder.clone() in DatastoreIO (#4449)
---
 .../src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
index 9b20c0d..7528dde 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java
@@ -890,7 +890,7 @@ public class DatastoreV1 {
 QueryResultBatch currentBatch = null;
 
 while (moreResults) {
-  Query.Builder queryBuilder = query.toBuilder().clone();
+  Query.Builder queryBuilder = query.toBuilder();
   queryBuilder.setLimit(Int32Value.newBuilder().setValue(
   Math.min(userLimit, QUERY_BATCH_LIMIT)));
 

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[jira] [Resolved] (BEAM-3388) Reduce Go runtime reflective overhead

2018-01-19 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde resolved BEAM-3388.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

> Reduce Go runtime reflective overhead
> -
>
> Key: BEAM-3388
> URL: https://issues.apache.org/jira/browse/BEAM-3388
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
> Fix For: 2.3.0
>
>
> Go reflection is slow and we should avoid it in the Go SDK at runtime, when 
> possible -- especially on the fast paths. It seems unlikely that the language 
> runtime/libraries will improve any time soon: 
> https://github.com/golang/go/issues/7818.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] 01/01: Merge pull request #4452 from herohde/runtime8

2018-01-19 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 76cc0a6f54c975e856aab232d4d2e979dca371ce
Merge: e019f19 92e3bbe
Author: Ahmet Altay 
AuthorDate: Fri Jan 19 17:25:08 2018 -0800

Merge pull request #4452 from herohde/runtime8

[BEAM-3388] Avoid reflect.Value conversions in Go runtime

 sdks/go/pkg/beam/core/runtime/exec/coder.go|   12 +-
 sdks/go/pkg/beam/core/runtime/exec/combine.go  |   22 +-
 sdks/go/pkg/beam/core/runtime/exec/emit.go |   14 +-
 sdks/go/pkg/beam/core/runtime/exec/fn.go   |   20 +-
 sdks/go/pkg/beam/core/runtime/exec/fn_test.go  |   44 +-
 sdks/go/pkg/beam/core/runtime/exec/fullvalue.go|   33 +-
 .../pkg/beam/core/runtime/exec/fullvalue_test.go   |   24 +-
 sdks/go/pkg/beam/core/runtime/exec/input.go|   24 +-
 .../beam/core/runtime/exec/optimized/emitters.go   | 4052 ++---
 .../beam/core/runtime/exec/optimized/emitters.tmpl |   20 +-
 .../pkg/beam/core/runtime/exec/optimized/inputs.go | 5988 ++--
 .../beam/core/runtime/exec/optimized/inputs.tmpl   |   24 +-
 sdks/go/pkg/beam/core/runtime/exec/pardo.go|3 +-
 sdks/go/pkg/beam/core/runtime/exec/unit.go |2 -
 sdks/go/pkg/beam/runners/direct/impulse.go |3 +-
 sdks/go/pkg/beam/testing/passert/passert.go|2 +-
 sdks/go/pkg/beam/transforms/top/top.go |4 +-
 17 files changed, 5139 insertions(+), 5152 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam] branch go-sdk updated (e019f19 -> 76cc0a6)

2018-01-19 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git.


from e019f19  CR: fix comments
 add 92e3bbe  Avoid reflect.Value conversions in Go runtime
 new 76cc0a6  Merge pull request #4452 from herohde/runtime8

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/go/pkg/beam/core/runtime/exec/coder.go|   12 +-
 sdks/go/pkg/beam/core/runtime/exec/combine.go  |   22 +-
 sdks/go/pkg/beam/core/runtime/exec/emit.go |   14 +-
 sdks/go/pkg/beam/core/runtime/exec/fn.go   |   20 +-
 sdks/go/pkg/beam/core/runtime/exec/fn_test.go  |   44 +-
 sdks/go/pkg/beam/core/runtime/exec/fullvalue.go|   33 +-
 .../pkg/beam/core/runtime/exec/fullvalue_test.go   |   24 +-
 sdks/go/pkg/beam/core/runtime/exec/input.go|   24 +-
 .../beam/core/runtime/exec/optimized/emitters.go   | 4052 ++---
 .../beam/core/runtime/exec/optimized/emitters.tmpl |   20 +-
 .../pkg/beam/core/runtime/exec/optimized/inputs.go | 5988 ++--
 .../beam/core/runtime/exec/optimized/inputs.tmpl   |   24 +-
 sdks/go/pkg/beam/core/runtime/exec/pardo.go|3 +-
 sdks/go/pkg/beam/core/runtime/exec/unit.go |2 -
 sdks/go/pkg/beam/runners/direct/impulse.go |3 +-
 sdks/go/pkg/beam/testing/passert/passert.go|2 +-
 sdks/go/pkg/beam/transforms/top/top.go |4 +-
 17 files changed, 5139 insertions(+), 5152 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #5707

2018-01-19 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_TextIOIT #43

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-3480) Jenkins IOIT jobs unstable

2018-01-19 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333069#comment-16333069
 ] 

Chamikara Jayalath commented on BEAM-3480:
--

cc: [~alanmyrvold]

> Jenkins IOIT jobs unstable
> --
>
> Key: BEAM-3480
> URL: https://issues.apache.org/jira/browse/BEAM-3480
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Dariusz Aniszewski
>Assignee: Jason Kuster
>Priority: Major
>
> For some time, Jenkins jobs running performance tests via PerfKit have been 
> failing due to permission issues while creating a temporary directory, i.e.:
> {noformat}
> OSError: [Errno 13] Permission denied: '/tmp/perfkitbenchmarker/runs/cf0322bf'
> {noformat}
>  
> It is possible that this is related to the workers' configuration. I noticed 
> that there are a few workers on which tests are passing and a few that are 
> constantly failing. See the list here:
> [https://docs.google.com/spreadsheets/d/1rvJjrR3BC9hwvpfWOszDoGRA7hr1Vke7PbitwQEKk0Q/edit#gid=0]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Python #811

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[batbat] Moved floating point equality findbugs annotation from generic xml file

[melissapa] [BEAM-3351] Fix Javadoc formatting issues

[melissapa] Update Context references to links

[tgroh] Implement a GRPC Provision Service

[tgroh] Add InboundDataClient

[lcwik] Use platformThreadFactory for default thread pool.

[robertwb] [BEAM-3490] Wrap DistributionData in a DistributionResult for

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 898277297be7160900fdb48ed486a257a6bc92c0 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 898277297be7160900fdb48ed486a257a6bc92c0
Commit message: "Merge pull request #4450 [BEAM-3490] Wrap DistributionData in a DistributionResult"
 > git rev-list a91f7ada2075b8d6adbbb7002a15f1078d8eae45 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5835360861966226757.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins143722234137480.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4062026425642829119.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5221037234891501694.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6219418177846044991.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 

Build failed in Jenkins: beam_PerformanceTests_Spark #1255

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[batbat] Moved floating point equality findbugs annotation from generic xml file

[melissapa] [BEAM-3351] Fix Javadoc formatting issues

[melissapa] Update Context references to links

[tgroh] Implement a GRPC Provision Service

[tgroh] Add InboundDataClient

[lcwik] Use platformThreadFactory for default thread pool.

[robertwb] [BEAM-3490] Wrap DistributionData in a DistributionResult for

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 898277297be7160900fdb48ed486a257a6bc92c0 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 898277297be7160900fdb48ed486a257a6bc92c0
Commit message: "Merge pull request #4450 [BEAM-3490] Wrap DistributionData in a DistributionResult"
 > git rev-list a91f7ada2075b8d6adbbb7002a15f1078d8eae45 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Spark] $ /bin/bash -xe /tmp/jenkins680999398035055205.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Spark] $ /bin/bash -xe /tmp/jenkins10440891515107820.sh
+ rm -rf .env
[beam_PerformanceTests_Spark] $ /bin/bash -xe /tmp/jenkins51027454890461840.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Spark] $ /bin/bash -xe /tmp/jenkins3512248596255112443.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Spark] $ /bin/bash -xe /tmp/jenkins343565189231313576.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 

Build failed in Jenkins: beam_PerformanceTests_TFRecordIOIT #35

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[batbat] Moved floating point equality findbugs annotation from generic xml file

[melissapa] [BEAM-3351] Fix Javadoc formatting issues

[melissapa] Update Context references to links

[tgroh] Implement a GRPC Provision Service

[tgroh] Add InboundDataClient

[lcwik] Use platformThreadFactory for default thread pool.

[robertwb] [BEAM-3490] Wrap DistributionData in a DistributionResult for

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 898277297be7160900fdb48ed486a257a6bc92c0 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 898277297be7160900fdb48ed486a257a6bc92c0
Commit message: "Merge pull request #4450 [BEAM-3490] Wrap DistributionData in a DistributionResult"
 > git rev-list a91f7ada2075b8d6adbbb7002a15f1078d8eae45 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_TFRecordIOIT] $ /bin/bash -xe 
/tmp/jenkins2187657405908038015.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_TFRecordIOIT] $ /bin/bash -xe 
/tmp/jenkins2486978036932494730.sh
+ rm -rf .env
[beam_PerformanceTests_TFRecordIOIT] $ /bin/bash -xe 
/tmp/jenkins469468508567616760.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_TFRecordIOIT] $ /bin/bash -xe 
/tmp/jenkins339579970817741014.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_TFRecordIOIT] $ /bin/bash -xe 
/tmp/jenkins1465903723274529662.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3943

2018-01-19 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Compressed_TextIOIT #35

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[batbat] Moved floating point equality findbugs annotation from generic xml file

[melissapa] [BEAM-3351] Fix Javadoc formatting issues

[melissapa] Update Context references to links

[tgroh] Implement a GRPC Provision Service

[tgroh] Add InboundDataClient

[lcwik] Use platformThreadFactory for default thread pool.

[robertwb] [BEAM-3490] Wrap DistributionData in a DistributionResult for

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 898277297be7160900fdb48ed486a257a6bc92c0 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 898277297be7160900fdb48ed486a257a6bc92c0
Commit message: "Merge pull request #4450 [BEAM-3490] Wrap DistributionData in a DistributionResult"
 > git rev-list a91f7ada2075b8d6adbbb7002a15f1078d8eae45 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Compressed_TextIOIT] $ /bin/bash -xe 
/tmp/jenkins2975383172716858791.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Compressed_TextIOIT] $ /bin/bash -xe 
/tmp/jenkins1326414230531066822.sh
+ rm -rf .env
[beam_PerformanceTests_Compressed_TextIOIT] $ /bin/bash -xe 
/tmp/jenkins386289241725677.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Compressed_TextIOIT] $ /bin/bash -xe 
/tmp/jenkins3503942854064575772.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Compressed_TextIOIT] $ /bin/bash -xe 
/tmp/jenkins3083472244633488033.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 

Build failed in Jenkins: beam_PerformanceTests_AvroIOIT #36

2018-01-19 Thread Apache Jenkins Server
See 


Changes:

[batbat] Moved floating point equality findbugs annotation from generic xml file

[melissapa] [BEAM-3351] Fix Javadoc formatting issues

[melissapa] Update Context references to links

[tgroh] Implement a GRPC Provision Service

[tgroh] Add InboundDataClient

[lcwik] Use platformThreadFactory for default thread pool.

[robertwb] [BEAM-3490] Wrap DistributionData in a DistributionResult for

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 898277297be7160900fdb48ed486a257a6bc92c0 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 898277297be7160900fdb48ed486a257a6bc92c0
Commit message: "Merge pull request #4450 [BEAM-3490] Wrap DistributionData in a DistributionResult"
 > git rev-list a91f7ada2075b8d6adbbb7002a15f1078d8eae45 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_AvroIOIT] $ /bin/bash -xe 
/tmp/jenkins8434807541768946444.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_AvroIOIT] $ /bin/bash -xe 
/tmp/jenkins7847789217083601198.sh
+ rm -rf .env
[beam_PerformanceTests_AvroIOIT] $ /bin/bash -xe 
/tmp/jenkins218703169749260.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_AvroIOIT] $ /bin/bash -xe 
/tmp/jenkins1976952698185606208.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_AvroIOIT] $ /bin/bash -xe 
/tmp/jenkins7141304261411922329.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4757

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Comment Edited] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Nalseez Duke (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333043#comment-16333043
 ] 

Nalseez Duke edited comment on BEAM-3503 at 1/20/18 12:20 AM:
--

The DataflowRunner is what I primarily work with/am interested in.


was (Author: nalseez):
The DataflowRunner is what I primarily work with.

> PubsubIO - DynamicDestinations
> --
>
> Key: BEAM-3503
> URL: https://issues.apache.org/jira/browse/BEAM-3503
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: 2.2.0
>Reporter: Nalseez Duke
>Assignee: Reuven Lax
>Priority: Minor
>
> PubsubIO does not support the "DynamicDestinations" notion that is currently 
> implemented for File-based I/O and BigQueryIO.
> It would be nice if PubsubIO could also support this functionality - the 
> ability to write to a Pub/Sub topic dynamically.
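
For readers unfamiliar with the precedent being cited, below is a rough, hedged sketch of the per-element destination pattern that BigQueryIO already exposes through DynamicDestinations; the request is for an analogous hook in PubsubIO, which does not exist today. The "event_type" field, table naming, and schema are illustrative placeholders.

{code:java}
// Sketch only: this is the existing BigQueryIO API, not a PubsubIO feature.
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.common.collect.ImmutableList;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.io.gcp.bigquery.WriteResult;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.ValueInSingleWindow;

class DynamicWriteSketch {
  static WriteResult writeByEventType(PCollection<TableRow> rows) {
    return rows.apply(
        BigQueryIO.writeTableRows()
            .to(new DynamicDestinations<TableRow, String>() {
                  @Override
                  public String getDestination(ValueInSingleWindow<TableRow> element) {
                    // Route each element on one of its fields (placeholder field name).
                    return (String) element.getValue().get("event_type");
                  }

                  @Override
                  public TableDestination getTable(String eventType) {
                    return new TableDestination("project:dataset.events_" + eventType, null);
                  }

                  @Override
                  public TableSchema getSchema(String eventType) {
                    // One shared schema for the sketch; a real pipeline could vary it per destination.
                    return new TableSchema().setFields(ImmutableList.of(
                        new TableFieldSchema().setName("event_type").setType("STRING")));
                  }
                })
            .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(WriteDisposition.WRITE_APPEND));
  }
}
{code}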



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Nalseez Duke (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333043#comment-16333043
 ] 

Nalseez Duke commented on BEAM-3503:


The DataflowRunner is what I primarily work with.

> PubsubIO - DynamicDestinations
> --
>
> Key: BEAM-3503
> URL: https://issues.apache.org/jira/browse/BEAM-3503
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: 2.2.0
>Reporter: Nalseez Duke
>Assignee: Reuven Lax
>Priority: Minor
>
> PubsubIO does not support the "DynamicDestinations" notion that is currently 
> implemented for File-based I/O and BigQueryIO.
> It would be nice if PubsubIO could also support this functionality - the 
> ability to write to a Pub/Sub topic dynamically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (49c839c -> 8982772)

2018-01-19 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 49c839c  Merge pull request #4427
 add 1f1904d  [BEAM-3490] Wrap DistributionData in a DistributionResult for 
FnApiRunner.
 new 8982772  Merge pull request #4450 [BEAM-3490] Wrap DistributionData in 
a DistributionResult

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/runners/portability/fn_api_runner.py  | 5 +++--
 sdks/python/apache_beam/runners/portability/fn_api_runner_test.py | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam] 01/01: Merge pull request #4450 [BEAM-3490] Wrap DistributionData in a DistributionResult

2018-01-19 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 898277297be7160900fdb48ed486a257a6bc92c0
Merge: 49c839c 1f1904d
Author: Robert Bradshaw 
AuthorDate: Fri Jan 19 16:14:35 2018 -0800

Merge pull request #4450 [BEAM-3490] Wrap DistributionData in a 
DistributionResult

 sdks/python/apache_beam/runners/portability/fn_api_runner.py  | 5 +++--
 sdks/python/apache_beam/runners/portability/fn_api_runner_test.py | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[jira] [Commented] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333024#comment-16333024
 ] 

Reuven Lax commented on BEAM-3503:
--

which runner are you interested in?

> PubsubIO - DynamicDestinations
> --
>
> Key: BEAM-3503
> URL: https://issues.apache.org/jira/browse/BEAM-3503
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: 2.2.0
>Reporter: Nalseez Duke
>Assignee: Reuven Lax
>Priority: Minor
>
> PubsubIO does not support the "DynamicDestinations" notion that is currently 
> implemented for File-based I/O and BigQueryIO.
> It would be nice if PubsubIO could also support this functionality - the 
> ability to write to a Pub/Sub topic dynamically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (6b6800c -> 49c839c)

2018-01-19 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 6b6800c  Merge pull request #4421
 add 064bf5e  Add InboundDataClient
 new 49c839c  Merge pull request #4427

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../runners/fnexecution/data/FnDataService.java|   5 +-
 .../src/main/resources/beam/findbugs-filter.xml|  11 ++
 .../apache/beam/sdk/fn/data/InboundDataClient.java |  59 
 .../beam/fn/harness/BeamFnDataReadRunner.java  |   6 +-
 .../beam/fn/harness/data/BeamFnDataClient.java |   7 +-
 .../beam/fn/harness/data/BeamFnDataGrpcClient.java |  18 ++-
 .../fn/harness/data/BeamFnDataInboundObserver.java |  49 +-
 .../data/CompletableFutureInboundDataClient.java   |  73 +
 .../beam/fn/harness/BeamFnDataReadRunnerTest.java  |  19 +--
 .../fn/harness/data/BeamFnDataGrpcClientTest.java  |  28 ++--
 .../data/BeamFnDataInboundObserverTest.java|  30 ++--
 .../CompletableFutureInboundDataClientTest.java| 166 +
 12 files changed, 407 insertions(+), 64 deletions(-)
 create mode 100644 
sdks/java/fn-execution/src/main/java/org/apache/beam/sdk/fn/data/InboundDataClient.java
 create mode 100644 
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/CompletableFutureInboundDataClient.java
 create mode 100644 
sdks/java/harness/src/test/java/org/apache/beam/fn/harness/data/CompletableFutureInboundDataClientTest.java

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam] 01/01: Merge pull request #4427

2018-01-19 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 49c839cc425708a6cf51748f40aca050cd64f16e
Merge: 6b6800c 064bf5e
Author: Thomas Groh 
AuthorDate: Fri Jan 19 15:54:34 2018 -0800

Merge pull request #4427

Add InboundDataClient

 .../runners/fnexecution/data/FnDataService.java|   5 +-
 .../src/main/resources/beam/findbugs-filter.xml|  11 ++
 .../apache/beam/sdk/fn/data/InboundDataClient.java |  59 
 .../beam/fn/harness/BeamFnDataReadRunner.java  |   6 +-
 .../beam/fn/harness/data/BeamFnDataClient.java |   7 +-
 .../beam/fn/harness/data/BeamFnDataGrpcClient.java |  18 ++-
 .../fn/harness/data/BeamFnDataInboundObserver.java |  49 +-
 .../data/CompletableFutureInboundDataClient.java   |  73 +
 .../beam/fn/harness/BeamFnDataReadRunnerTest.java  |  19 +--
 .../fn/harness/data/BeamFnDataGrpcClientTest.java  |  28 ++--
 .../data/BeamFnDataInboundObserverTest.java|  30 ++--
 .../CompletableFutureInboundDataClientTest.java| 166 +
 12 files changed, 407 insertions(+), 64 deletions(-)


-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam] 01/04: Renamed Go runtime Caller to Func and added name

2018-01-19 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git

commit c67de937c7d732d68ef363ff46c18b77916938d5
Author: Henning Rohde 
AuthorDate: Fri Jan 12 15:13:26 2018 -0800

Renamed Go runtime Caller to Func and added name
---
 sdks/go/pkg/beam/core/runtime/exec/callers.go   | 400 ++
 sdks/go/pkg/beam/core/runtime/exec/callers.tmpl |  18 +-
 sdks/go/pkg/beam/core/runtime/exec/combine.go   |   4 +-
 sdks/go/pkg/beam/core/util/reflectx/call.go |  51 +-
 sdks/go/pkg/beam/core/util/reflectx/calls.go| 968 ++--
 sdks/go/pkg/beam/core/util/reflectx/calls.tmpl  |  38 +-
 sdks/go/pkg/beam/core/util/reflectx/json.go |   2 +-
 sdks/go/pkg/beam/transforms/filter/filter.go|   4 +-
 8 files changed, 858 insertions(+), 627 deletions(-)

diff --git a/sdks/go/pkg/beam/core/runtime/exec/callers.go 
b/sdks/go/pkg/beam/core/runtime/exec/callers.go
index 8c374bf..b47751d 100644
--- a/sdks/go/pkg/beam/core/runtime/exec/callers.go
+++ b/sdks/go/pkg/beam/core/runtime/exec/callers.go
@@ -35,510 +35,598 @@ import (
 // For now, we just do #2.
 
 func init() {
-   reflectx.RegisterCaller(reflect.TypeOf((*func([]byte, []byte) 
[]byte)(nil)).Elem(), callMakerByteSliceM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(bool, bool) 
bool)(nil)).Elem(), callMakerBoolM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(string, string) 
string)(nil)).Elem(), callMakerStringM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(int, int) 
int)(nil)).Elem(), callMakerIntM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(int8, int8) 
int8)(nil)).Elem(), callMakerInt8M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(int16, int16) 
int16)(nil)).Elem(), callMakerInt16M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(int32, int32) 
int32)(nil)).Elem(), callMakerInt32M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(int64, int64) 
int64)(nil)).Elem(), callMakerInt64M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(uint, uint) 
uint)(nil)).Elem(), callMakerUintM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(uint8, uint8) 
uint8)(nil)).Elem(), callMakerUint8M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(uint16, uint16) 
uint16)(nil)).Elem(), callMakerUint16M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(uint32, uint32) 
uint32)(nil)).Elem(), callMakerUint32M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(uint64, uint64) 
uint64)(nil)).Elem(), callMakerUint64M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(float32, float32) 
float32)(nil)).Elem(), callMakerFloat32M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(float64, float64) 
float64)(nil)).Elem(), callMakerFloat64M)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(typex.T, typex.T) 
typex.T)(nil)).Elem(), callMakerTypex_TM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(typex.U, typex.U) 
typex.U)(nil)).Elem(), callMakerTypex_UM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(typex.V, typex.V) 
typex.V)(nil)).Elem(), callMakerTypex_VM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(typex.W, typex.W) 
typex.W)(nil)).Elem(), callMakerTypex_WM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(typex.X, typex.X) 
typex.X)(nil)).Elem(), callMakerTypex_XM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(typex.Y, typex.Y) 
typex.Y)(nil)).Elem(), callMakerTypex_YM)
-   reflectx.RegisterCaller(reflect.TypeOf((*func(typex.Z, typex.Z) 
typex.Z)(nil)).Elem(), callMakerTypex_ZM)
-}
-
-type nativeByteSliceMCaller struct {
+   reflectx.RegisterFunc(reflect.TypeOf((*func([]byte, []byte) 
[]byte)(nil)).Elem(), funcMakerByteSliceM)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(bool, bool) 
bool)(nil)).Elem(), funcMakerBoolM)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(string, string) 
string)(nil)).Elem(), funcMakerStringM)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(int, int) 
int)(nil)).Elem(), funcMakerIntM)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(int8, int8) 
int8)(nil)).Elem(), funcMakerInt8M)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(int16, int16) 
int16)(nil)).Elem(), funcMakerInt16M)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(int32, int32) 
int32)(nil)).Elem(), funcMakerInt32M)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(int64, int64) 
int64)(nil)).Elem(), funcMakerInt64M)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(uint, uint) 
uint)(nil)).Elem(), funcMakerUintM)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(uint8, uint8) 
uint8)(nil)).Elem(), funcMakerUint8M)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(uint16, uint16) 
uint16)(nil)).Elem(), funcMakerUint16M)
+   reflectx.RegisterFunc(reflect.TypeOf((*func(uint32, uint32) 
uint32)(nil)).Elem(), funcMakerUint32M)
+   

[beam] branch go-sdk updated (c5a3ce0 -> e019f19)

2018-01-19 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git.


from c5a3ce0  Add initialization of active plans map.
 new c67de93  Renamed Go runtime Caller to Func and added name
 new 09e98e4  Use reflectx.Func as the fundamental function representation
 new 8a9d916  CR: fix DynFn comments
 new e019f19  CR: fix comments

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/go/cmd/symtab/main.go | 4 +-
 sdks/go/pkg/beam/core/funcx/fn.go  |37 +-
 sdks/go/pkg/beam/core/funcx/fn_test.go | 3 +-
 sdks/go/pkg/beam/core/graph/bind_test.go   | 4 +-
 sdks/go/pkg/beam/core/graph/coder/coder.go | 8 +-
 sdks/go/pkg/beam/core/graph/edge.go| 4 +-
 sdks/go/pkg/beam/core/graph/fn.go  |34 +-
 sdks/go/pkg/beam/core/runtime/exec/callers.go  |   544 -
 sdks/go/pkg/beam/core/runtime/exec/coder.go| 8 +-
 sdks/go/pkg/beam/core/runtime/exec/combine.go  | 4 +-
 sdks/go/pkg/beam/core/runtime/exec/decode.go   |   108 +-
 sdks/go/pkg/beam/core/runtime/exec/decoders.go |  1347 --
 sdks/go/pkg/beam/core/runtime/exec/decoders.tmpl   |89 -
 sdks/go/pkg/beam/core/runtime/exec/emit.go | 2 -
 sdks/go/pkg/beam/core/runtime/exec/emitters.go | 14214 --
 sdks/go/pkg/beam/core/runtime/exec/encode.go   |   109 +-
 sdks/go/pkg/beam/core/runtime/exec/encoders.go |  1171 --
 sdks/go/pkg/beam/core/runtime/exec/encoders.tmpl   |81 -
 sdks/go/pkg/beam/core/runtime/exec/fn.go   |19 +-
 sdks/go/pkg/beam/core/runtime/exec/fn_test.go  |15 +-
 sdks/go/pkg/beam/core/runtime/exec/input.go| 2 -
 .../beam/core/runtime/exec/optimized/callers.go|   632 +
 .../core/runtime/exec/{ => optimized}/callers.tmpl |24 +-
 .../beam/core/runtime/exec/optimized/decoders.go   |  2407 
 .../beam/core/runtime/exec/optimized/decoders.tmpl |   146 +
 .../beam/core/runtime/exec/optimized/emitters.go   | 14215 +++
 .../runtime/exec/{ => optimized}/emitters.tmpl |32 +-
 .../beam/core/runtime/exec/optimized/encoders.go   |  2299 +++
 .../beam/core/runtime/exec/optimized/encoders.tmpl |   146 +
 .../scope.go => runtime/exec/optimized/gen.go} |30 +-
 .../core/runtime/exec/{ => optimized}/inputs.go|  4055 +++---
 .../core/runtime/exec/{ => optimized}/inputs.tmpl  |26 +-
 sdks/go/pkg/beam/core/runtime/exec/util.go |11 -
 sdks/go/pkg/beam/core/runtime/graphx/serialize.go  |11 +-
 sdks/go/pkg/beam/core/runtime/graphx/translate.go  | 2 +-
 sdks/go/pkg/beam/core/runtime/graphx/user.go   |11 +-
 sdks/go/pkg/beam/core/util/reflectx/call.go|64 +-
 sdks/go/pkg/beam/core/util/reflectx/calls.go   |   968 +-
 sdks/go/pkg/beam/core/util/reflectx/calls.tmpl |38 +-
 sdks/go/pkg/beam/core/util/reflectx/json.go| 2 +-
 sdks/go/pkg/beam/encoding.go   |23 +-
 sdks/go/pkg/beam/partition.go  |62 +-
 sdks/go/pkg/beam/transforms/filter/filter.go   |10 +-
 sdks/go/pkg/beam/transforms/top/top.go |33 +-
 sdks/go/pkg/beam/transforms/top/top_test.go| 7 +-
 sdks/go/pkg/beam/x/beamx/run.go| 2 +
 46 files changed, 22823 insertions(+), 20240 deletions(-)
 delete mode 100644 sdks/go/pkg/beam/core/runtime/exec/callers.go
 delete mode 100644 sdks/go/pkg/beam/core/runtime/exec/decoders.go
 delete mode 100644 sdks/go/pkg/beam/core/runtime/exec/decoders.tmpl
 delete mode 100644 sdks/go/pkg/beam/core/runtime/exec/emitters.go
 delete mode 100644 sdks/go/pkg/beam/core/runtime/exec/encoders.go
 delete mode 100644 sdks/go/pkg/beam/core/runtime/exec/encoders.tmpl
 create mode 100644 sdks/go/pkg/beam/core/runtime/exec/optimized/callers.go
 rename sdks/go/pkg/beam/core/runtime/exec/{ => optimized}/callers.tmpl (72%)
 create mode 100644 sdks/go/pkg/beam/core/runtime/exec/optimized/decoders.go
 create mode 100644 sdks/go/pkg/beam/core/runtime/exec/optimized/decoders.tmpl
 create mode 100644 sdks/go/pkg/beam/core/runtime/exec/optimized/emitters.go
 rename sdks/go/pkg/beam/core/runtime/exec/{ => optimized}/emitters.tmpl (64%)
 create mode 100644 sdks/go/pkg/beam/core/runtime/exec/optimized/encoders.go
 create mode 100644 sdks/go/pkg/beam/core/runtime/exec/optimized/encoders.tmpl
 copy sdks/go/pkg/beam/core/{graph/scope.go => runtime/exec/optimized/gen.go} 
(51%)
 rename sdks/go/pkg/beam/core/runtime/exec/{ => optimized}/inputs.go (70%)
 rename sdks/go/pkg/beam/core/runtime/exec/{ => optimized}/inputs.tmpl (76%)

-- 
To stop receiving notification emails 

[beam] 03/04: CR: fix DynFn comments

2018-01-19 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 8a9d916be7b9945e9b7576dbf52f0a80b3249cbd
Author: Henning Rohde 
AuthorDate: Thu Jan 18 19:04:21 2018 -0800

CR: fix DynFn comments
---
 sdks/go/pkg/beam/core/graph/fn.go| 4 +++-
 sdks/go/pkg/beam/core/runtime/exec/fn.go | 4 
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/sdks/go/pkg/beam/core/graph/fn.go 
b/sdks/go/pkg/beam/core/graph/fn.go
index 032227f..6ed05a1 100644
--- a/sdks/go/pkg/beam/core/graph/fn.go
+++ b/sdks/go/pkg/beam/core/graph/fn.go
@@ -61,7 +61,9 @@ type DynFn struct {
Name string
// T is the type of the generated function
T reflect.Type
-   // Data holds the data for the generator. This data is trivially 
serializable.
+   // Data holds the data, if any, for the generator. Each function
+   // generator typically needs some configuration data, which is
+   // required by the DynFn to be encoded.
Data []byte
// Gen is the function generator. The function generator itself must be 
a
// function with a unique symbol.
diff --git a/sdks/go/pkg/beam/core/runtime/exec/fn.go 
b/sdks/go/pkg/beam/core/runtime/exec/fn.go
index 49a9bac..9ad422f 100644
--- a/sdks/go/pkg/beam/core/runtime/exec/fn.go
+++ b/sdks/go/pkg/beam/core/runtime/exec/fn.go
@@ -26,10 +26,6 @@ import (
"github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx"
 )
 
-// NOTE(herohde) 12/11/2017: the below helpers are ripe for 
type-specialization,
-// if the reflection overhead here turns out to be significant. It would
-// be nice to be able to quantify any potential improvements first, however.
-
 // MainInput is the main input and is unfolded in the invocation, if present.
 type MainInput struct {
KeyFullValue

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam] 04/04: CR: fix comments

2018-01-19 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git

commit e019f199cca2e98f9dc8e544d5e2c6796b1be3dc
Author: Henning Rohde 
AuthorDate: Fri Jan 19 15:45:48 2018 -0800

CR: fix comments
---
 sdks/go/pkg/beam/core/runtime/exec/encode.go | 2 +-
 sdks/go/pkg/beam/core/util/reflectx/call.go  | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/sdks/go/pkg/beam/core/runtime/exec/encode.go 
b/sdks/go/pkg/beam/core/runtime/exec/encode.go
index 2940578..7f6bbee 100644
--- a/sdks/go/pkg/beam/core/runtime/exec/encode.go
+++ b/sdks/go/pkg/beam/core/runtime/exec/encode.go
@@ -30,7 +30,7 @@ type Encoder interface {
 }
 
 func makeEncoder(fn reflectx.Func) Encoder {
-   // Detect one of the valid decoder forms and allow it to be invoked
+   // Detect one of the valid encoder forms and allow it to be invoked
// efficiently, relying on the general reflectx.Func framework for
// avoiding expensive reflection calls. There are 4 forms:
//
diff --git a/sdks/go/pkg/beam/core/util/reflectx/call.go 
b/sdks/go/pkg/beam/core/util/reflectx/call.go
index d011ec5..fbca8e3 100644
--- a/sdks/go/pkg/beam/core/util/reflectx/call.go
+++ b/sdks/go/pkg/beam/core/util/reflectx/call.go
@@ -45,8 +45,8 @@ var (
funcsMu sync.Mutex
 )
 
-// RegisterFunc registers an custom reflectFunc factory for the given type, 
such as
-// "func(int)bool". If multiple func factories are registered for the same 
type,
+// RegisterFunc registers an custom Func factory for the given type, such as
+// "func(int)bool". If multiple Func factories are registered for the same 
type,
 // the last registration wins.
 func RegisterFunc(t reflect.Type, maker func(interface{}) Func) {
funcsMu.Lock()
@@ -59,7 +59,7 @@ func RegisterFunc(t reflect.Type, maker func(interface{}) 
Func) {
funcs[key] = maker
 }
 
-// MakeFunc returns a reflectFunc for given function.
+// MakeFunc returns a Func for given function.
 func MakeFunc(fn interface{}) Func {
funcsMu.Lock()
maker, exists := funcs[reflect.TypeOf(fn).String()]

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[jira] [Assigned] (BEAM-3484) HadoopInputFormatIO reads big datasets invalid

2018-01-19 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-3484:


Assignee: Ismaël Mejía  (was: Chamikara Jayalath)

> HadoopInputFormatIO reads big datasets invalid
> --
>
> Key: BEAM-3484
> URL: https://issues.apache.org/jira/browse/BEAM-3484
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Łukasz Gajowy
>Assignee: Ismaël Mejía
>Priority: Major
> Attachments: result_sorted100, result_sorted60
>
>
> For big datasets HadoopInputFormat sometimes skips/duplicates elements from 
> database in resulting PCollection. This gives incorrect read result.
> Occurred to me while developing HadoopInputFormatIOIT and running it on 
> dataflow. For datasets smaller or equal to 600 000 database rows I wasn't 
> able to reproduce the issue. Bug appeared only for bigger sets, eg. 700 000, 
> 1 000 000. 
> Attachments:
>   - text file with sorted HadoopInputFormat.read() result saved using 
> TextIO.write().to().withoutSharding(). If you look carefully you'll notice 
> duplicates or missing values that should not happen
>  - same text file for 600 000 records not having any duplicates and missing 
> elements
>  - link to a PR with HadoopInputFormatIO integration test that allows to 
> reproduce this issue. At the moment of writing, this code is not merged yet.
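
For reference, a minimal sketch of the kind of dump described above: read through HadoopInputFormatIO and write every value to a single unsharded text file that can then be sorted and diffed offline. The Configuration keys, the LongWritable/Text key-value types, and the output path are placeholders that depend on the actual InputFormat used.

{code:java}
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.SimpleFunction;
import org.apache.beam.sdk.values.KV;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class DumpHifRead {
  public static void main(String[] args) {
    // The configuration must name the InputFormat plus key/value classes
    // (and, for a database read, the connection settings); omitted here.
    Configuration conf = new Configuration();

    Pipeline p = Pipeline.create();
    p.apply(HadoopInputFormatIO.<LongWritable, Text>read().withConfiguration(conf))
        .apply(MapElements.via(
            new SimpleFunction<KV<LongWritable, Text>, String>() {
              @Override
              public String apply(KV<LongWritable, Text> kv) {
                return kv.getValue().toString();
              }
            }))
        .apply(TextIO.write().to("/tmp/hifio-result").withoutSharding());
    p.run().waitUntilFinish();
  }
}
{code}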



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3323) Create a generator of finite-but-unbounded PCollection's for integration testing

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3323:
-

Assignee: (was: Kenneth Knowles)

> Create a generator of finite-but-unbounded PCollection's for integration 
> testing
> 
>
> Key: BEAM-3323
> URL: https://issues.apache.org/jira/browse/BEAM-3323
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Priority: Major
>
> Several IOs have features that exhibit nontrivial behavior when writing 
> unbounded PCollection's - e.g. WriteFiles with windowed writes; BigQueryIO. 
> We need to be able to write integration tests for these features.
> Currently we have two ways to generate an unbounded PCollection without 
> reading from a real-world external streaming system such as pubsub or kafka:
> 1) TestStream, which only works in direct runner - sufficient for some tests 
> but not all: definitely not sufficient for large-scale tests or for tests 
> that need to interact with a real instance of the external system (e.g. 
> BigQueryIO). It is also quite verbose to use.
> 2) GenerateSequence.from(0) without a .to(), which returns an infinite amount 
> of data.
> GenerateSequence.from(a).to(b) returns a finite amount of data, but returns 
> it as a bounded PCollection, and doesn't report the watermark.
> I think the right thing to do here, for now, is to make 
> GenerateSequence.from(a).to(b) have an option (e.g. ".asUnbounded()", where 
> it will return an unbounded PCollection, go through UnboundedSource (or 
> potentially via SDF in runners that support it), and track the watermark 
> properly (or via a configurable watermark fn).
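
For contrast, a short sketch of the two GenerateSequence shapes mentioned above; the bounds and rate are arbitrary.

{code:java}
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.GenerateSequence;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

class GenerateSequenceShapes {
  static void build(Pipeline p) {
    // Finite range: today this comes back as a *bounded* PCollection.
    PCollection<Long> bounded = p.apply("Bounded", GenerateSequence.from(0).to(1_000_000));

    // No .to(): an unbounded PCollection that never finishes (rate-limited here).
    // The proposal above would add a finite-but-unbounded variant of from(a).to(b)
    // that goes through UnboundedSource and reports a watermark.
    PCollection<Long> unbounded =
        p.apply("Unbounded", GenerateSequence.from(0).withRate(1000, Duration.standardSeconds(1)));
  }
}
{code}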



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3165) Mongo document read with non hex objectid

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3165:
--
Component/s: (was: sdk-java-core)
 sdk-java-extensions

> Mongo document read with non hex objectid
> -
>
> Key: BEAM-3165
> URL: https://issues.apache.org/jira/browse/BEAM-3165
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Affects Versions: 2.1.0
>Reporter: Utkarsh Sopan
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>
> I have a mongo collection which has a non-hex '_id' in the form of a string.
> I can't read it into a PCollection; I get the following exception:
> Exception in thread "main" java.lang.IllegalArgumentException: invalid 
> hexadecimal representation of an ObjectId: [somestring]
>   at org.bson.types.ObjectId.parseHexString(ObjectId.java:523)
>   at org.bson.types.ObjectId.(ObjectId.java:237)
>   at 
> org.bson.json.JsonReader.visitObjectIdConstructor(JsonReader.java:674)
>   at org.bson.json.JsonReader.readBsonType(JsonReader.java:197)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:139)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45)
>   at org.bson.codecs.configuration.LazyCodec.decode(LazyCodec.java:47)
>   at org.bson.codecs.DocumentCodec.readValue(DocumentCodec.java:215)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:141)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45)
>   at org.bson.codecs.DocumentCodec.readValue(DocumentCodec.java:215)
>   at org.bson.codecs.DocumentCodec.readList(DocumentCodec.java:222)
>   at org.bson.codecs.DocumentCodec.readValue(DocumentCodec.java:208)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:141)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45)
>   at org.bson.Document.parse(Document.java:105)
>   at org.bson.Document.parse(Document.java:90)
>   at 
> org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbReader.start(MongoDbIO.java:472)
>   at 
> org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.processElement(BoundedReadEvaluatorFactory.java:141)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:146)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:110)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)
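
A minimal read that exercises the code path in the stack trace above; the URI, database, and collection names are placeholders.

{code:java}
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.mongodb.MongoDbIO;
import org.apache.beam.sdk.values.PCollection;
import org.bson.Document;

public class MongoNonHexIdRepro {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();
    PCollection<Document> docs =
        p.apply(MongoDbIO.read()
            .withUri("mongodb://localhost:27017")
            .withDatabase("mydb")
            .withCollection("mycollection"));
    // The bounded reader builds "_id" range filters and, per the trace above,
    // parses the bounds as ObjectId hex strings, which fails when "_id" is an
    // arbitrary (non-hex) string.
    p.run().waitUntilFinish();
  }
}
{code}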



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3165) Mongo document read with non hex objectid

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3165:
-

Assignee: Jean-Baptiste Onofré  (was: Kenneth Knowles)

> Mongo document read with non hex objectid
> -
>
> Key: BEAM-3165
> URL: https://issues.apache.org/jira/browse/BEAM-3165
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.1.0
>Reporter: Utkarsh Sopan
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>
> I have a mongo collection which has a non-hex '_id' in the form of a string.
> I can't read it into a PCollection; I get the following exception:
> Exception in thread "main" java.lang.IllegalArgumentException: invalid 
> hexadecimal representation of an ObjectId: [somestring]
>   at org.bson.types.ObjectId.parseHexString(ObjectId.java:523)
>   at org.bson.types.ObjectId.(ObjectId.java:237)
>   at 
> org.bson.json.JsonReader.visitObjectIdConstructor(JsonReader.java:674)
>   at org.bson.json.JsonReader.readBsonType(JsonReader.java:197)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:139)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45)
>   at org.bson.codecs.configuration.LazyCodec.decode(LazyCodec.java:47)
>   at org.bson.codecs.DocumentCodec.readValue(DocumentCodec.java:215)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:141)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45)
>   at org.bson.codecs.DocumentCodec.readValue(DocumentCodec.java:215)
>   at org.bson.codecs.DocumentCodec.readList(DocumentCodec.java:222)
>   at org.bson.codecs.DocumentCodec.readValue(DocumentCodec.java:208)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:141)
>   at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45)
>   at org.bson.Document.parse(Document.java:105)
>   at org.bson.Document.parse(Document.java:90)
>   at 
> org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbReader.start(MongoDbIO.java:472)
>   at 
> org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.processElement(BoundedReadEvaluatorFactory.java:141)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:146)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:110)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3484) HadoopInputFormatIO reads big datasets invalid

2018-01-19 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333005#comment-16333005
 ] 

Chamikara Jayalath commented on BEAM-3484:
--

[~iemejia] is this something you are interested in looking into ? Feel free to 
unassign if not.

cc: [~jbonofre]

> HadoopInputFormatIO reads big datasets invalid
> --
>
> Key: BEAM-3484
> URL: https://issues.apache.org/jira/browse/BEAM-3484
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Łukasz Gajowy
>Assignee: Chamikara Jayalath
>Priority: Major
> Attachments: result_sorted100, result_sorted60
>
>
> For big datasets HadoopInputFormat sometimes skips/duplicates elements from 
> database in resulting PCollection. This gives incorrect read result.
> Occurred to me while developing HadoopInputFormatIOIT and running it on 
> dataflow. For datasets smaller or equal to 600 000 database rows I wasn't 
> able to reproduce the issue. Bug appeared only for bigger sets, eg. 700 000, 
> 1 000 000. 
> Attachments:
>   - text file with sorted HadoopInputFormat.read() result saved using 
> TextIO.write().to().withoutSharding(). If you look carefully you'll notice 
> duplicates or missing values that should not happen
>  - same text file for 600 000 records not having any duplicates and missing 
> elements
>  - link to a PR with HadoopInputFormatIO integration test that allows to 
> reproduce this issue. At the moment of writing, this code is not merged yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-3049) Java SDK Harness bundles non-relocated code, including Dataflow runner

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-3049.
---
   Resolution: Fixed
Fix Version/s: 2.3.0

> Java SDK Harness bundles non-relocated code, including Dataflow runner
> --
>
> Key: BEAM-3049
> URL: https://issues.apache.org/jira/browse/BEAM-3049
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.3.0
>
>
> This causes a problem if something depends on the harness but does not want 
> the harness's copy of its dependencies. I know we intend to break the 
> dependency on the Dataflow runner. It also bundles a couple other things 
> unshaded.
> Mostly, the harness should be executed entirely containerized so it doesn't 
> matter, in which case there's no need to relocate anything, and bundling is 
> just a convenience. But we should have a clear policy that we adhere to. 
> Either it is a library and should have good hygiene, or if it doesn't have 
> good hygiene it must not be used as a library.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-2795) FlinkRunner: translate using SDK-agnostic means

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2795:
--
Reporter: Kenneth Knowles  (was: Ben Sidhom)

> FlinkRunner: translate using SDK-agnostic means
> ---
>
> Key: BEAM-2795
> URL: https://issues.apache.org/jira/browse/BEAM-2795
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-2795) FlinkRunner: translate using SDK-agnostic means

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-2795:
-

Assignee: Ben Sidhom  (was: Kenneth Knowles)

> FlinkRunner: translate using SDK-agnostic means
> ---
>
> Key: BEAM-2795
> URL: https://issues.apache.org/jira/browse/BEAM-2795
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Kenneth Knowles
>Assignee: Ben Sidhom
>Priority: Major
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3378) Using ValueProvider with FixedWindows and Duration

2018-01-19 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16333001#comment-16333001
 ] 

Kenneth Knowles commented on BEAM-3378:
---

It is a good request to consider. Thanks for filing it!

> Using ValueProvider with FixedWindows and Duration
> 
>
> Key: BEAM-3378
> URL: https://issues.apache.org/jira/browse/BEAM-3378
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.2.0
>Reporter: Shushu Inbar
>Priority: Minor
>
> Moving toward Apache Beam 2.x, I want to use templates as much as possible, 
> and ValueProvider accordingly.
> In my logic, I am using a FixedWindow, but the duration is flexible, so I'd 
> rather get it from a ValueProvider.
> The problem is that FixedWindows.of() only accepts a Duration, and I failed to find 
> a simple way to grab a ValueProvider and make a Duration out of it.
> According to Eugene Kirpichov "Unfortunately this question has a simple 
> answer: this is currently not supported"
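
To make the gap concrete, a sketch under the following assumptions: an upstream PCollection<String> and an illustrative windowMinutes template option (the option name is not part of any existing API).

{code:java}
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.ValueProvider;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

class WindowSizeSketch {
  // Illustrative template option the issue would like to drive the window size.
  public interface MyOptions extends PipelineOptions {
    ValueProvider<Long> getWindowMinutes();
    void setWindowMinutes(ValueProvider<Long> value);
  }

  // What works today: the size is a concrete Duration, fixed at graph construction.
  static PCollection<String> windowFixed(PCollection<String> input) {
    return input.apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(5))));
  }

  // What the issue asks for: FixedWindows.of(...) has no ValueProvider overload,
  // so options.getWindowMinutes() would have to be resolved with .get() at
  // construction time, which defeats the point of a template.
}
{code}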



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3262) Provide a withReadableByteChannelFactory on TextIO.Read

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3262:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> Provide a withReadableByteChannelFactory on TextIO.Read
> ---
>
> Key: BEAM-3262
> URL: https://issues.apache.org/jira/browse/BEAM-3262
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Tobi Vollebregt
>Assignee: Chamikara Jayalath
>Priority: Minor
>
> Hi,
> I am prototyping a dataflow job that reads from GPG encrypted files. I got 
> this working, but had to jump through some hoops to supply a new 
> "compression" format to TextIO: I basically had to copy a good part of 
> TextIO.Read only to add a {{withDecompressingChannelFactory}} method similar 
> to {{withWritableByteChannelFactory}} in TextIO.Write.
> Would it be possible to add a method {{withDecompressingChannelFactory}} or 
> {{withReadableByteChannelFactory}} to TextIO.Read to allow users to plug in 
> custom decompression/decryption code?
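
For comparison, the write-side hook the issue would like mirrored on TextIO.Read, shown here with the built-in GZIP factory; the output path is a placeholder.

{code:java}
import org.apache.beam.sdk.io.FileBasedSink;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PDone;

class WritableChannelFactorySketch {
  static PDone write(PCollection<String> lines) {
    // A custom WritableByteChannelFactory (e.g. a GPG-encrypting one) could be
    // supplied here instead of the built-in GZIP compression.
    return lines.apply(
        TextIO.write()
            .to("/tmp/output")
            .withWritableByteChannelFactory(FileBasedSink.CompressionType.GZIP));
  }
}
{code}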



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3378) Using ValueProvider with FixedWindows and Duration

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3378:
-

Assignee: (was: Kenneth Knowles)

> Using ValueProvider with FixedWindows and Duration
> 
>
> Key: BEAM-3378
> URL: https://issues.apache.org/jira/browse/BEAM-3378
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.2.0
>Reporter: Shushu Inbar
>Priority: Minor
>
> Moving toward Apache Beam 2.x, I want to use templates as much as possible, 
> and ValueProvider accordingly.
> In my logic, I am using a FixedWindow, but the duration is flexible, so I'd 
> rather get it from a ValueProvider.
> The problem is that FixedWindows.of() only accepts a Duration, and I failed to find 
> a simple way to grab a ValueProvider and make a Duration out of it.
> According to Eugene Kirpichov "Unfortunately this question has a simple 
> answer: this is currently not supported"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3488) Reduce log noise coming from File sink

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3488:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> Reduce log noise coming from File sink
> --
>
> Key: BEAM-3488
> URL: https://issues.apache.org/jira/browse/BEAM-3488
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.2.0
>Reporter: Pawel Bartoszek
>Assignee: Chamikara Jayalath
>Priority: Minor
>
> After switching to Beam 2.2 I noticed that File sink related classes generate 
> lots of lines like this:
> {code:java}
> 2018-01-16 01:37:37,080 INFO org.apache.beam.sdk.io.FileBasedSink - No output 
> files to write.
> 2018-01-16 01:37:37,104 INFO org.apache.beam.sdk.io.WriteFiles - Will 
> finalize 0 files{code}
>  
> I did some counts and it looks like these lines account for 82% of all lines 
> in the log yet give little information.
> I am happy to raise a PR to make "No output files to write." and "Will 
> finalize {} files" logged at DEBUG level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3501) BigQuery Partitioned table creation/write fails when destination has partition decorator

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3501:
--
Component/s: (was: beam-model)
 sdk-java-gcp

> BigQuery Partitioned table creation/write fails when destination has 
> partition decorator
> 
>
> Key: BEAM-3501
> URL: https://issues.apache.org/jira/browse/BEAM-3501
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Darshan Mehta
>Assignee: Kenneth Knowles
>Priority: Major
>
> Following is the code that writes to BigQuery: 
> {code:java}
> BigQueryIO.writeTableRows()
>  .to(destination)
>  .withCreateDisposition(CREATE_IF_NEEDED)
>  .withWriteDisposition(WRITE_APPEND)
>  .withSchema(tableSchema)
>  .expand(tableRows);{code}
>  
> Here's the destination's implementation: 
> {code:java}
> public TableDestination apply(ValueInSingleWindow input) {
>  String partition = timestampExtractor.apply(input.getValue())
>  .toString(DateTimeFormat.forPattern("MMdd").withZoneUTC());
>  TableReference tableReference = new TableReference();
>  tableReference.setDatasetId(dataset);
>  tableReference.setProjectId(projectId);
>  tableReference.setTableId(String.format("%s_%s", table, partition));
>  log.debug("Will write to BigQuery table: %s", tableReference);
>  return new TableDestination(tableReference, null);
> }{code}
>  
> When the dataflow tries to write to this table, I see the following message:
> {code:java}
> "errors" : [ {
>  "domain" : "global",
>  "message" : "Cannot read partition information from a table that is not 
> partitioned: :.$19730522",
>  "reason" : "invalid"
>  } ]{code}
> So, it looks like it's not creating tables with partition in the first place? 
> Apache beam version : 2.2.0
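
For what it's worth, a hedged sketch of the partition-decorator form the error message points at. It assumes the base table already exists as an ingestion-time partitioned table; as of 2.2.0, CREATE_IF_NEEDED does not appear to create the base table as partitioned, which would explain the error above. The timestampExtractor helper and field names are taken from the reporter's snippet.

{code:java}
// Variant of the destination function above that targets a partition of an
// existing partitioned table via the "$" decorator (sketch, not a confirmed fix).
public TableDestination apply(ValueInSingleWindow<TableRow> input) {
  String day = timestampExtractor.apply(input.getValue())
      .toString(DateTimeFormat.forPattern("yyyyMMdd").withZoneUTC());
  String tableSpec = String.format("%s:%s.%s$%s", projectId, dataset, table, day);
  return new TableDestination(tableSpec, null);
}
{code}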



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3501) BigQuery Partitioned table creation/write fails when destination has partition decorator

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3501:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> BigQuery Partitioned table creation/write fails when destination has 
> partition decorator
> 
>
> Key: BEAM-3501
> URL: https://issues.apache.org/jira/browse/BEAM-3501
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Darshan Mehta
>Assignee: Chamikara Jayalath
>Priority: Major
>
> Following is the code that writes to BigQuery: 
> {code:java}
> BigQueryIO.writeTableRows()
>  .to(destination)
>  .withCreateDisposition(CREATE_IF_NEEDED)
>  .withWriteDisposition(WRITE_APPEND)
>  .withSchema(tableSchema)
>  .expand(tableRows);{code}
>  
> Here's the destination's implementation: 
> {code:java}
> public TableDestination apply(ValueInSingleWindow input) {
>  String partition = timestampExtractor.apply(input.getValue())
>  .toString(DateTimeFormat.forPattern("MMdd").withZoneUTC());
>  TableReference tableReference = new TableReference();
>  tableReference.setDatasetId(dataset);
>  tableReference.setProjectId(projectId);
>  tableReference.setTableId(String.format("%s_%s", table, partition));
>  log.debug("Will write to BigQuery table: %s", tableReference);
>  return new TableDestination(tableReference, null);
> }{code}
>  
> When the dataflow tries to write to this table, I see the following message:
> {code:java}
> "errors" : [ {
>  "domain" : "global",
>  "message" : "Cannot read partition information from a table that is not 
> partitioned: :.$19730522",
>  "reason" : "invalid"
>  } ]{code}
> So, it looks like it's not creating tables with partition in the first place? 
> Apache beam version : 2.2.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3484) HadoopInputFormatIO reads big datasets invalid

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3484:
--
Component/s: (was: runner-dataflow)
 (was: beam-model)
 sdk-java-extensions

> HadoopInputFormatIO reads big datasets invalid
> --
>
> Key: BEAM-3484
> URL: https://issues.apache.org/jira/browse/BEAM-3484
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Łukasz Gajowy
>Assignee: Kenneth Knowles
>Priority: Major
> Attachments: result_sorted100, result_sorted60
>
>
> For big datasets HadoopInputFormat sometimes skips/duplicates elements from 
> database in resulting PCollection. This gives incorrect read result.
> Occurred to me while developing HadoopInputFormatIOIT and running it on 
> dataflow. For datasets smaller or equal to 600 000 database rows I wasn't 
> able to reproduce the issue. Bug appeared only for bigger sets, eg. 700 000, 
> 1 000 000. 
> Attachments:
>   - text file with sorted HadoopInputFormat.read() result saved using 
> TextIO.write().to().withoutSharding(). If you look carefully you'll notice 
> duplicates or missing values that should not happen
>  - same text file for 600 000 records not having any duplicates and missing 
> elements
>  - link to a PR with HadoopInputFormatIO integration test that allows to 
> reproduce this issue. At the moment of writing, this code is not merged yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3484) HadoopInputFormatIO reads big datasets invalid

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3484:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> HadoopInputFormatIO reads big datasets invalid
> --
>
> Key: BEAM-3484
> URL: https://issues.apache.org/jira/browse/BEAM-3484
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Łukasz Gajowy
>Assignee: Chamikara Jayalath
>Priority: Major
> Attachments: result_sorted100, result_sorted60
>
>
> For big datasets HadoopInputFormat sometimes skips/duplicates elements from 
> database in resulting PCollection. This gives incorrect read result.
> Occurred to me while developing HadoopInputFormatIOIT and running it on 
> dataflow. For datasets smaller or equal to 600 000 database rows I wasn't 
> able to reproduce the issue. Bug appeared only for bigger sets, eg. 700 000, 
> 1 000 000. 
> Attachments:
>   - text file with sorted HadoopInputFormat.read() result saved using 
> TextIO.write().to().withoutSharding(). If you look carefully you'll notice 
> duplicates or missing values that should not happen
>  - same text file for 600 000 records not having any duplicates and missing 
> elements
>  - link to a PR with HadoopInputFormatIO integration test that allows to 
> reproduce this issue. At the moment of writing, this code is not merged yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3494) Snapshot state of aggregated data of apache beam project is not maintained in flink's checkpointing

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3494:
-

Assignee: Aljoscha Krettek  (was: Kenneth Knowles)

> Snapshot state of aggregated data of apache beam project is not maintained in 
> flink's checkpointing 
> 
>
> Key: BEAM-3494
> URL: https://issues.apache.org/jira/browse/BEAM-3494
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: suganya
>Assignee: Aljoscha Krettek
>Priority: Major
>
> We have a Beam project which consumes events from Kafka, does a groupBy in a 
> time window (5 mins), and after the window elapses pushes the events downstream 
> for merging. This project is deployed using Flink, and we have enabled checkpointing 
> to recover from a failed state.
> (windowSize: 5 mins, checkpointingInterval: 5 mins, state.backend: filesystem)
> Offsets from Kafka get checkpointed every 5 
> mins (checkpointingInterval). The event offsets are checkpointed before the entire 
> DAG (groupBy and merge) finishes, so in case of any restart 
> of the task manager the new task starts from the last successful checkpoint, but 
> we are not able to get the aggregated snapshot data (the data from the groupBy task) 
> from the persisted checkpoint.
> We are able to retrieve the last successfully checkpointed offset from Kafka, but 
> not the aggregated data as of that checkpoint.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3494) Snapshot state of aggregated data of apache beam project is not maintained in flink's checkpointing

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3494:
--
Component/s: (was: sdk-java-core)
 runner-flink

> Snapshot state of aggregated data of apache beam project is not maintained in 
> flink's checkpointing 
> 
>
> Key: BEAM-3494
> URL: https://issues.apache.org/jira/browse/BEAM-3494
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: suganya
>Assignee: Kenneth Knowles
>Priority: Major
>
> We have a Beam project which consumes events from Kafka, does a groupBy in a 
> time window (5 mins), and after the window elapses pushes the events downstream 
> for merging. This project is deployed using Flink, and we have enabled checkpointing 
> to recover from a failed state.
> (windowSize: 5 mins, checkpointingInterval: 5 mins, state.backend: filesystem)
> Offsets from Kafka get checkpointed every 5 
> mins (checkpointingInterval). The event offsets are checkpointed before the entire 
> DAG (groupBy and merge) finishes, so in case of any restart 
> of the task manager the new task starts from the last successful checkpoint, but 
> we are not able to get the aggregated snapshot data (the data from the groupBy task) 
> from the persisted checkpoint.
> We are able to retrieve the last successfully checkpointed offset from Kafka, but 
> not the aggregated data as of that checkpoint.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-1643) IO ITs: add validation of row content

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-1643:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> IO ITs: add validation of row content
> -
>
> Key: BEAM-1643
> URL: https://issues.apache.org/jira/browse/BEAM-1643
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Stephen Sisk
>Assignee: Chamikara Jayalath
>Priority: Major
>
> We'd like to validate the contents of what we read from data stores more 
> comprehensively than just counting the # of rows returned.  Hashing the rows 
> returned would be an easy way to do that. 
> Stores to convert: 
> Jdbc
> Elasticsearch
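
One possible shape for such a check is sketched below: hash every row, fold the hashes with an order-insensitive combiner, and assert on the single result. XorHashFn and EXPECTED_CHECKSUM are illustrative names, not existing Beam utilities, and XOR is only a sketch-level choice (paired duplicates cancel out, so a real IT might prefer a stronger combiner).

{code:java}
import java.nio.charset.StandardCharsets;
import com.google.common.hash.Hashing;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.values.PCollection;

class RowContentCheck {
  static final long EXPECTED_CHECKSUM = 0L; // placeholder expected value

  static class XorHashFn extends Combine.CombineFn<String, Long, Long> {
    @Override public Long createAccumulator() { return 0L; }
    @Override public Long addInput(Long acc, String row) {
      // Hash each row deterministically, then fold with XOR (order-insensitive).
      return acc ^ Hashing.murmur3_128().hashString(row, StandardCharsets.UTF_8).asLong();
    }
    @Override public Long mergeAccumulators(Iterable<Long> accs) {
      long merged = 0L;
      for (long a : accs) { merged ^= a; }
      return merged;
    }
    @Override public Long extractOutput(Long acc) { return acc; }
  }

  static void assertContent(PCollection<String> rows) {
    PCollection<Long> checksum = rows.apply(Combine.globally(new XorHashFn()));
    PAssert.thatSingleton(checksum).isEqualTo(EXPECTED_CHECKSUM);
  }
}
{code}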



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3455) Request payload size exceeds the limit: 10485760 bytes

2018-01-19 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332985#comment-16332985
 ] 

Chamikara Jayalath commented on BEAM-3455:
--

This issue was already discussed in the mailing list.

[https://lists.apache.org/thread.html/a4346e5bbef22fc311e76ccd13568bd9a4e0b4e83a137fc1d63ed3b8@%3Cuser.beam.apache.org%3E]

Unais, do you need additional info regarding this ?

 

> Request payload size exceeds the limit: 10485760 bytes
> --
>
> Key: BEAM-3455
> URL: https://issues.apache.org/jira/browse/BEAM-3455
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, build-system, runner-dataflow
>Reporter: Unais
>Assignee: Chamikara Jayalath
>Priority: Major
>
> I wrote a Python Dataflow job to read data from BigQuery, do some 
> transforms, and save the result as a BQ table.
> I tested with 8 days of data and it works fine - when I scaled to 180 days I’m 
> getting the below error
> ```"message": "Request payload size exceeds the limit: 10485760 bytes.",```
> ```pitools.base.py.exceptions.HttpError: HttpError accessing 
> :
>  response: <{'status': '400', 'content-length': '145', 'x-xss-protection': 
> '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 
> 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', 
> '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Wed, 10 Jan 
> 2018 22:49:32 GMT', 'x-frame-options': 'SAMEORIGIN', 'alt-svc': 'hq=":443"; 
> ma=2592000; quic=51303431; quic=51303339; quic=51303338; quic=51303337; 
> quic=51303335,quic=":443"; ma=2592000; v="41,39,38,37,35"', 'content-type': 
> 'application/json; charset=UTF-8'}>, content <{
> "error": {
> "code": 400,
> "message": "Request payload size exceeds the limit: 10485760 bytes.",
> "status": "INVALID_ARGUMENT"
> }
> ```
> In short, this is what I’m doing:
> 1 - Reading data from a BigQuery table using
> ```beam.io.BigQuerySource ```
> 2 - Partitioning by day using
> ``` beam.Partition ```
> 3 - Applying transforms to each partition and combining some output PCollections.
> 4 - After the transforms, the results are saved to a BigQuery date-partitioned 
> table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-2535:
-

Assignee: Batkhuyag Batsaikhan  (was: Kenneth Knowles)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Batkhuyag Batsaikhan
>Priority: Major
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
> 2. For a processing time timer, it is the current input watermark at the time 
> of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.
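
For context, here is a minimal sketch (assuming the Beam Java SDK; the class and identifiers are illustrative only) of the current stateful/timely {{DoFn}} timer API that this proposal builds on. Today the output produced in {{@OnTimer}} simply carries the timer's own firing timestamp; there is no separate way to declare the event time of the outputs, which is the gap described above.

{code:java}
import org.apache.beam.sdk.state.TimeDomain;
import org.apache.beam.sdk.state.Timer;
import org.apache.beam.sdk.state.TimerSpec;
import org.apache.beam.sdk.state.TimerSpecs;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
import org.apache.beam.sdk.values.KV;

class ExpiringCountFn extends DoFn<KV<String, Long>, KV<String, Long>> {

  @TimerId("endOfWindow")
  private final TimerSpec endOfWindowSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

  @ProcessElement
  public void process(
      ProcessContext c,
      BoundedWindow window,
      @TimerId("endOfWindow") Timer timer) {
    // Fire when the input watermark passes the end of the window.
    timer.set(window.maxTimestamp());
  }

  @OnTimer("endOfWindow")
  public void onEndOfWindow(OnTimerContext c) {
    // The output below takes the timer's firing timestamp as its event time;
    // there is currently no way to specify a separate output time / watermark hold.
    c.output(KV.of("done", 0L));
  }
}
{code}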



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3455) Request payload size exceeds the limit: 10485760 bytes

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3455:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> Request payload size exceeds the limit: 10485760 bytes
> --
>
> Key: BEAM-3455
> URL: https://issues.apache.org/jira/browse/BEAM-3455
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model, build-system, runner-dataflow
>Reporter: Unais
>Assignee: Chamikara Jayalath
>Priority: Major
>
> I wrote a Python Dataflow job to read data from BigQuery, do some 
> transforms, and save the result as a BigQuery table.
> I tested with 8 days of data and it works fine; when I scaled to 180 days I’m 
> getting the error below:
> ```"message": "Request payload size exceeds the limit: 10485760 bytes.",```
> ```apitools.base.py.exceptions.HttpError: HttpError accessing 
> :
>  response: <{'status': '400', 'content-length': '145', 'x-xss-protection': 
> '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 
> 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', 
> '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Wed, 10 Jan 
> 2018 22:49:32 GMT', 'x-frame-options': 'SAMEORIGIN', 'alt-svc': 'hq=":443"; 
> ma=2592000; quic=51303431; quic=51303339; quic=51303338; quic=51303337; 
> quic=51303335,quic=":443"; ma=2592000; v="41,39,38,37,35"', 'content-type': 
> 'application/json; charset=UTF-8'}>, content <{
> "error": {
> "code": 400,
> "message": "Request payload size exceeds the limit: 10485760 bytes.",
> "status": "INVALID_ARGUMENT"
> }
> ```
> In short, this is what I’m doing:
> 1 - Reading data from a BigQuery table using
> ```beam.io.BigQuerySource```
> 2 - Partitioning each day's data using
> ```beam.Partition```
> 3 - Applying transforms to each partition and combining some output PCollections.
> 4 - After the transforms, the results are saved to a BigQuery date-partitioned 
> table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3225) Non deterministic behaviour of AfterProcessingTime trigger with multiple group by transformations

2018-01-19 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332980#comment-16332980
 ] 

Kenneth Knowles commented on BEAM-3225:
---

Eugene has the most context on this, as he recently really dug into the issue 
of continuation triggers, triggers that terminate, and file sinks.

> Non deterministic behaviour of AfterProcessingTime trigger with multiple 
> group by transformations
> -
>
> Key: BEAM-3225
> URL: https://issues.apache.org/jira/browse/BEAM-3225
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-flink
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Pawel Bartoszek
>Assignee: Eugene Kirpichov
>Priority: Critical
>
> *Context*
> I run my [test 
> code|https://gist.github.com/pbartoszek/fe6b1d7501333a2a4b385a1424f6bd80] 
> against different triggers and runners. My original problem was that when 
> writing to a file sink files weren't always produced in a deterministic way. 
> Please refer to this 
> [BEAM-3151|https://issues.apache.org/jira/browse/BEAM-3151] When I started 
> looking at the WriteFiles class I noticed that the file sink implementation includes 
> multiple GroupByKey transformations. Knowing that, I wrote my test code, 
> which uses multiple GroupByKey transformations, and concluded that GroupByKey's 
> support of After(Synchronised)ProcessingTime triggers is a bit buggy(?), which 
> also influences the file sink. When I ran my job using the 
> Dataflow runner I got the expected output.
> *About test code*
> The job is counting how many A and B elements it received within 30 sec 
> windows using Count.perElement. Then I am using GroupByKey to fire every time 
> count has increased.
> Below I outlined the expected standard output: 
> {code:java}
> Let's assume all events are received in the same 30 seconds window.
> A -> After count KV{A, 1} -> Final group by KV{A, [1]}
> A -> After count KV{A, 2} -> Final group by KV{A, [1,2]}
> A -> After count KV{A, 3} -> Final group by KV{A, [1,2,3]}
> B -> After count KV{B, 1} -> Final group by KV{B, [1]}
> With my trigger configuration I would expect that for every new element 
> 'After count' is printed with new value followed by 'Final group by' with new 
> counter. Final group by represents the history of counters then.{code}
>  
> *Problem*
> 'Final group by' trigger doesn't always go off although trigger set up would 
> suggest that. This behaviour is different for different runners and Beam 
> versions. 
>  
> *My observations when using Pubsub*
> Trigger configuration
> {code:java}
> Window.into(FixedWindows.of(standardSeconds(30)))
> .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
> .withAllowedLateness(standardSeconds(5), Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
> .accumulatingFiredPanes())
> {code}
>  
>  Beam 2.0 Flink Runner
> {code:java}
> 2017-11-16T14:51:44.294Z After count KV{A, 1} 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:51:53.036Z Received Element A 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:51:53.143Z After count KV{A, 2} 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:51:53.246Z Final group by KV{A, [1, 2]} 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:52:03.522Z Received Element A 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:03.629Z After count KV{A, 1} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:03.732Z Final group by KV{A, [1]} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:07.270Z Received Element A 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:07.372Z After count KV{A, 2} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:07.476Z Final group by KV{A, [1, 2]} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:10.394Z Received Element A 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:10.501Z After count KV{A, 3} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:10.602Z Final group by KV{A, [1, 2, 3]} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:13.296Z Received Element A 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:13.402Z After count KV{A, 4} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:13.507Z Final group by KV{A, [1, 2, 3, 4]} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:14.951Z Received Element A 
> 

[jira] [Commented] (BEAM-3225) Non deterministic behaviour of AfterProcessingTime trigger with multiple group by transformations

2018-01-19 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332979#comment-16332979
 ] 

Kenneth Knowles commented on BEAM-3225:
---

I haven't read the whole trace you posted yet, but I want to highlight this: 
Only the DataflowRunner and DirectRunner support the "synchronized" part of 
synchronized processing time. And we basically intend to drop it. We've had 
some discussions about dangerous behavior that it causes. Essentially, if you 
choose a repeating processing time trigger, you should expect all your data out 
in some number of panes from the second GBK, but you can't be too picky about 
how many.

> Non deterministic behaviour of AfterProcessingTime trigger with multiple 
> group by transformations
> -
>
> Key: BEAM-3225
> URL: https://issues.apache.org/jira/browse/BEAM-3225
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-flink
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Pawel Bartoszek
>Assignee: Kenneth Knowles
>Priority: Critical
>
> *Context*
> I run my [test 
> code|https://gist.github.com/pbartoszek/fe6b1d7501333a2a4b385a1424f6bd80] 
> against different triggers and runners. My original problem was that when 
> writing to a file sink files weren't always produced in a deterministic way. 
> Please refer to this 
> [BEAM-3151|https://issues.apache.org/jira/browse/BEAM-3151] When I started 
> looking at the WriteFiles class I noticed that the file sink implementation includes 
> multiple GroupByKey transformations. Knowing that, I wrote my test code, 
> which uses multiple GroupByKey transformations, and concluded that GroupByKey's 
> support of After(Synchronised)ProcessingTime triggers is a bit buggy(?), which 
> also influences the file sink. When I ran my job using the 
> Dataflow runner I got the expected output.
> *About test code*
> The job is counting how many A and B elements it received within 30 sec 
> windows using Count.perElement. Then I am using GroupByKey to fire every time 
> count has increased.
> Below I outlined the expected standard output: 
> {code:java}
> Let's assume all events are received in the same 30 seconds window.
> A -> After count KV{A, 1} -> Final group by KV{A, [1]}
> A -> After count KV{A, 2} -> Final group by KV{A, [1,2]}
> A -> After count KV{A, 3} -> Final group by KV{A, [1,2,3]}
> B -> After count KV{B, 1} -> Final group by KV{B, [1]}
> With my trigger configuration I would expect that for every new element 
> 'After count' is printed with new value followed by 'Final group by' with new 
> counter. Final group by represents the history of counters then.{code}
>  
> *Problem*
> 'Final group by' trigger doesn't always go off although trigger set up would 
> suggest that. This behaviour is different for different runners and Beam 
> versions. 
>  
> *My observations when using Pubsub*
> Trigger configuration
> {code:java}
> Window.into(FixedWindows.of(standardSeconds(30)))
> .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
> .withAllowedLateness(standardSeconds(5), Window.ClosingBehavior.FIRE_IF_NON_EMPTY)
> .accumulatingFiredPanes())
> {code}
>  
>  Beam 2.0 Flink Runner
> {code:java}
> 2017-11-16T14:51:44.294Z After count KV{A, 1} 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:51:53.036Z Received Element A 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:51:53.143Z After count KV{A, 2} 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:51:53.246Z Final group by KV{A, [1, 2]} 
> [2017-11-16T14:51:30.000Z..2017-11-16T14:52:00.000Z)
> 2017-11-16T14:52:03.522Z Received Element A 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:03.629Z After count KV{A, 1} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:03.732Z Final group by KV{A, [1]} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:07.270Z Received Element A 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:07.372Z After count KV{A, 2} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:07.476Z Final group by KV{A, [1, 2]} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:10.394Z Received Element A 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:10.501Z After count KV{A, 3} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:10.602Z Final group by KV{A, [1, 2, 3]} 
> [2017-11-16T14:52:00.000Z..2017-11-16T14:52:30.000Z)
> 2017-11-16T14:52:13.296Z Received Element A 
> 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4756

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-3219) DataflowRunner: @Setup not called for batch stateful DoFn

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3219:
--
Fix Version/s: (was: 2.2.0)
   2.3.0

> DataflowRunner: @Setup not called for batch stateful DoFn
> -
>
> Key: BEAM-3219
> URL: https://issues.apache.org/jira/browse/BEAM-3219
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0, 2.2.0
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.3.0
>
>
> Reported on u...@beam.apache.org: 
> https://lists.apache.org/thread.html/33c0038508308b69527dab5c3d27cf17ed65fdd9f5d7f6a17cf6d794@%3Cuser.beam.apache.org%3E
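
For reference, a minimal sketch (Beam Java SDK; the {{client}} field is a hypothetical stand-in for an expensive resource) of the pattern affected by this bug: a batch stateful {{DoFn}} whose {{@Setup}} method is expected to run before any element is processed.

{code:java}
import org.apache.beam.sdk.coders.VarLongCoder;
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.state.ValueState;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

class StatefulSetupFn extends DoFn<KV<String, Long>, Long> {

  @StateId("count")
  private final StateSpec<ValueState<Long>> countSpec = StateSpecs.value(VarLongCoder.of());

  // Hypothetical expensive resource, initialized once per DoFn instance.
  private transient Object client;

  @Setup
  public void setup() {
    // Expected to run before processElement; the reported bug is that the
    // Dataflow runner skipped this for batch stateful DoFns, leaving fields null.
    client = new Object();
  }

  @ProcessElement
  public void process(ProcessContext c, @StateId("count") ValueState<Long> count) {
    if (client == null) {
      throw new IllegalStateException("@Setup was not called");
    }
    Long stored = count.read();
    long current = (stored == null ? 0L : stored) + 1;
    count.write(current);
    c.output(current);
  }
}
{code}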



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3219) DataflowRunner: @Setup not called for batch stateful DoFn

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3219:
--
Affects Version/s: 2.2.0

> DataflowRunner: @Setup not called for batch stateful DoFn
> -
>
> Key: BEAM-3219
> URL: https://issues.apache.org/jira/browse/BEAM-3219
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0, 2.2.0
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.3.0
>
>
> Reported on u...@beam.apache.org: 
> https://lists.apache.org/thread.html/33c0038508308b69527dab5c3d27cf17ed65fdd9f5d7f6a17cf6d794@%3Cuser.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-3052) ReduceFnRunner sets end-of-window hold even when no data is buffered

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-3052.
---
   Resolution: Fixed
Fix Version/s: 2.3.0

> ReduceFnRunner sets end-of-window hold even when no data is buffered
> 
>
> Key: BEAM-3052
> URL: https://issues.apache.org/jira/browse/BEAM-3052
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.3.0
>
>
> If you set a trigger that ignores the end of the window (like a repeated 
> processing-time trigger) and an early firing emits all the data, the 
> ReduceFnRunner will leave an end-of-window watermark hold in place while setting 
> only a GC timer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3096) generic api support for Graph Computation like GraphX on Spark

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3096:
-

Assignee: (was: Kenneth Knowles)

> generic api support for Graph Computation like GraphX on Spark
> --
>
> Key: BEAM-3096
> URL: https://issues.apache.org/jira/browse/BEAM-3096
> Project: Beam
>  Issue Type: Wish
>  Components: sdk-java-extensions
>Reporter: rayeaster
>Priority: Major
>  Labels: features
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Is there any plan to add support for graph computation like GraphX on Spark?
> * graph representation in PCollection 
> * basic statistics like vertex/edge count
> * basic functions like vertex/edge-wise map-reduce tasks (i.e., count the outgoing 
> degree of a vertex)
> * basic functions like subgraph combine/join
> * ...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-3081) Our findbugs config does not actually use Nullable annotations effectively

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-3081.
---
   Resolution: Fixed
Fix Version/s: Not applicable

I'm satisfied that enough precedent exists. We could take it further and test 
that every package-info.java has the proper annotation, but it is diminishing 
returns at this point.

> Our findbugs config does not actually use Nullable annotations effectively
> --
>
> Key: BEAM-3081
> URL: https://issues.apache.org/jira/browse/BEAM-3081
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>
> We use {{@Nullable}} annotations mostly appropriately, but in fact our 
> findbugs config was not delivering value based on these annotations, because 
> it does not default to {{@NonNull}}. We can and should set this default.
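
For illustration, the package-level default being referred to looks roughly like the sketch below (the package name is a placeholder): with this in a package-info.java, FindBugs treats unannotated parameters and return values in that package as {{@NonNull}}, so the existing {{@Nullable}} annotations start carrying weight.

{code:java}
// package-info.java
@DefaultAnnotation(NonNull.class)
package org.apache.beam.sdk.example;

import edu.umd.cs.findbugs.annotations.DefaultAnnotation;
import edu.umd.cs.findbugs.annotations.NonNull;
{code}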



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-3265) Gradle test output link is inaccessible URL

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles closed BEAM-3265.
-
   Resolution: Won't Fix
Fix Version/s: Not applicable

> Gradle test output link is inaccessible URL
> ---
>
> Key: BEAM-3265
> URL: https://issues.apache.org/jira/browse/BEAM-3265
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>
> When tests fail, you can get output like this:
> "There were failing tests. See the report at: 
> file:///home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_GradleBuild/src/sdks/java/io/kinesis/build/reports/tests/test/index.html"
> Of course, that URL is not something where a click will get you there. We 
> should improve the integration here in some way or another. Really, just 
> dumping the failure seems to be as good as any fancy plugin IMO.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3265) Gradle test output link is inaccessible URL

2018-01-19 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332970#comment-16332970
 ] 

Kenneth Knowles commented on BEAM-3265:
---

"You should just rely on the  and not expect to be able to see the "

It should be obvious that I filed this because the UI did not present enough or 
correct information. There was no actionable thing to do except re-run the 
whole build locally and hope to gain information. Missing or wrong info in the 
UI isn't a surprising or new situation - the build log and outputs are the 
source of truth and Jenkins puts a veneer on top, and the veneer has bugs and 
limitations.

I haven't hit a problem lately, so I will close this and just file another 
thing the next time I get blocked.

> Gradle test output link is inaccessible URL
> ---
>
> Key: BEAM-3265
> URL: https://issues.apache.org/jira/browse/BEAM-3265
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>
> When tests fail, you can get output like this:
> "There were failing tests. See the report at: 
> file:///home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_GradleBuild/src/sdks/java/io/kinesis/build/reports/tests/test/index.html"
> Of course, that URL is not something where a click will get you there. We 
> should improve the integration here in some way or another. Really, just 
> dumping the failure seems to be as good as any fancy plugin IMO.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #5706

2018-01-19 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #5705

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-3152) AfterProcessingTime trigger doesn't create any file panes

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3152:
-

Assignee: Eugene Kirpichov  (was: Kenneth Knowles)

> AfterProcessingTime trigger doesn't create any file panes
> -
>
> Key: BEAM-3152
> URL: https://issues.apache.org/jira/browse/BEAM-3152
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0
>Reporter: Pawel Bartoszek
>Assignee: Eugene Kirpichov
>Priority: Major
>
> Scenario:
> I want to count how many A and B events I am getting for a given 30-second window. I 
> require that every pane has all event types (A and B) with corresponding 
> counters - this is why I am using Combine.globally.
> The calculation logic works fine; the problem is with writing files. The files 
> are not written.
> For debugging purposes I created some transformations (Simulate 
> ApplyShardLabel, Simulate GroupIntoShards etc) that mimics that logic 
> implemented by WriteFiles.
> If you push strings "A" and "B" to the Kinesis stream, I see the following 
> System.out from the job:
> {code:java}
> AFTER COMBINE: {A=1, B=1}
> {code}
> According to my test transformations I should also see:
> {code:java}
> AFTER COMBINE: {A=1, B=1}
> Simulating ApplyShardLabel
> Simulating finalizing writer: KV{null, [KV{0, [{A=1, B=1}]}]}
> {code}
> Using DirectRunner and Beam 2.0.0. When I switch to Beam 2.1.0 I see the 
> expected debug output and files being written out.
> I think that there is some issue with AfterSynchronizedProcessingTime trigger 
> support.
> I cannot replicate the issue when using `TestStream`
> The test code can be found at
> [https://gist.github.com/pbartoszek/9dd58c4fcfc5171eafba3520cb3040fa|https://gist.github.com/pbartoszek/9dd58c4fcfc5171eafba3520cb3040fa]
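
As an aside, a minimal sketch of the kind of {{TestStream}}-based reproduction mentioned above (Beam Java SDK; element values and timings are illustrative, and the DirectRunner is assumed to be on the classpath). {{TestStream}} advances processing time explicitly, which makes processing-time trigger firings deterministic:

{code:java}
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.testing.TestStream;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Repeatedly;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.TimestampedValue;
import org.joda.time.Duration;
import org.joda.time.Instant;

public class TriggerRepro {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    // Two elements in the same 30-second event-time window, with explicit
    // processing-time advances between them.
    TestStream<String> events =
        TestStream.create(StringUtf8Coder.of())
            .addElements(TimestampedValue.of("A", new Instant(0)))
            .advanceProcessingTime(Duration.standardSeconds(10))
            .addElements(TimestampedValue.of("B", new Instant(0)))
            .advanceProcessingTime(Duration.standardSeconds(10))
            .advanceWatermarkToInfinity();

    p.apply(events)
        .apply(
            Window.<String>into(FixedWindows.of(Duration.standardSeconds(30)))
                .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
                .withAllowedLateness(Duration.standardSeconds(5))
                .accumulatingFiredPanes())
        .apply(Count.perElement());

    p.run().waitUntilFinish();
  }
}
{code}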



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3942

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-3341) Event time timer not firing with TestStream

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3341:
-

Assignee: Batkhuyag Batsaikhan  (was: Kenneth Knowles)

> Event time timer not firing with TestStream
> ---
>
> Key: BEAM-3341
> URL: https://issues.apache.org/jira/browse/BEAM-3341
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Kenneth Knowles
>Assignee: Batkhuyag Batsaikhan
>Priority: Major
>
> https://github.com/andrewrjones/beam-test-stream-timer/blob/master/src/test/java/com/andrewjones/beam/TimerTest.java
> Unclear whether it is a DirectRunner problem or a core SDK problem. For now, 
> assuming it is DirectRunner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3492) Spark Integration Tests fail with a Closed Connection

2018-01-19 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332929#comment-16332929
 ] 

Kenneth Knowles commented on BEAM-3492:
---

JB, did you have some idea on this? I cannot remember. Maybe we can find 
someone who might have investigated.

> Spark Integration Tests fail with a Closed Connection
> -
>
> Key: BEAM-3492
> URL: https://issues.apache.org/jira/browse/BEAM-3492
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Thomas Groh
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>
> Example: 
> [https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/16832]
>  
> 2018-01-17T23:52:25.668 [ERROR] 
> testE2EWordCount(org.apache.beam.examples.WordCountIT)  Time elapsed: 14.329 
> s  <<< ERROR!
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
> Connection from /127.0.0.1:45363 closed
>   at 
> org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:68)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3492) Spark Integration Tests fail with a Closed Connection

2018-01-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3492:
-

Assignee: Jean-Baptiste Onofré  (was: Amit Sela)

> Spark Integration Tests fail with a Closed Connection
> -
>
> Key: BEAM-3492
> URL: https://issues.apache.org/jira/browse/BEAM-3492
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Thomas Groh
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>
> Example: 
> [https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/16832]
>  
> 2018-01-17T23:52:25.668 [ERROR] 
> testE2EWordCount(org.apache.beam.examples.WordCountIT)  Time elapsed: 14.329 
> s  <<< ERROR!
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
> Connection from /127.0.0.1:45363 closed
>   at 
> org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:68)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3941

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-3502) Avoid use of proto.Builder.clone() in DatastoreIO

2018-01-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-3502:
---

Assignee: Larry Li  (was: Kenneth Knowles)

> Avoid use of proto.Builder.clone() in DatastoreIO
> -
>
> Key: BEAM-3502
> URL: https://issues.apache.org/jira/browse/BEAM-3502
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Affects Versions: 2.2.0
>Reporter: Larry Li
>Assignee: Larry Li
>Priority: Minor
> Fix For: 2.3.0
>
>
> DatastoreIO uses proto.Builder.clone() here:
> [https://github.com/apache/beam/blob/c0f0e1fd63ce1e9dfe1db71adf1c8b9e88ce7038/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java#L893]
>  
> It's only used in one place for actual runtime code, but this causes 
> incompatibility problems with Google-internal Java proto generation, i.e. we 
> get a 'NoSuchMethodError' when attempting to run the pipeline with internal 
> build tools.
>  
> This is a known problem that's already been worked around once:
> https://issues.apache.org/jira/browse/BEAM-2392
> ..but the fix only applied to BigtableServiceImpl. This extends those changes 
> to DatastoreIO, replacing its single use of clone(). Associated tests 
> shouldn't need refactoring, as this only appears as a problem at runtime.
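
For illustration only (this is the general shape of the workaround, not necessarily the exact change in the associated PR): instead of cloning a builder, build the message and open a fresh builder from it, which avoids the {{Builder.clone()}} method that is missing in some generated-proto variants.

{code:java}
import com.google.datastore.v1.Query;

class QueryCopyExample {
  // Problematic pattern: Builder.clone() is not available in every
  // protobuf code-generation variant, which surfaces as a NoSuchMethodError.
  static Query.Builder copyWithClone(Query.Builder base) {
    return base.clone();
  }

  // Workaround pattern: build an immutable message and start a new builder from it.
  static Query.Builder copyWithoutClone(Query.Builder base) {
    return base.build().toBuilder();
  }
}
{code}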



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3492) Spark Integration Tests fail with a Closed Connection

2018-01-19 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332918#comment-16332918
 ] 

Kenneth Knowles commented on BEAM-3492:
---

This is quite common: 
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/org.apache.beam$beam-examples-java/16902/console

I notice that it does not happen in the gradle build. Perhaps it has to do with 
the one maven profile that includes all the runners. Since the maven config is 
known to be bad, we could just transition this particular signal completely to gradle, 
and the problem would already be fixed.

> Spark Integration Tests fail with a Closed Connection
> -
>
> Key: BEAM-3492
> URL: https://issues.apache.org/jira/browse/BEAM-3492
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Thomas Groh
>Assignee: Amit Sela
>Priority: Major
>
> Example: 
> [https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/16832]
>  
> 2018-01-17T23:52:25.668 [ERROR] 
> testE2EWordCount(org.apache.beam.examples.WordCountIT)  Time elapsed: 14.329 
> s  <<< ERROR!
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
> Connection from /127.0.0.1:45363 closed
>   at 
> org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:68)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (e86ddb4 -> 6b6800c)

2018-01-19 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from e86ddb4  Use platformThreadFactory for default thread pool.
 add cb8f7a2  Implement a GRPC Provision Service
 new 6b6800c  Merge pull request #4421

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../provisioning/StaticGrpcProvisionService.java   | 55 
 .../fnexecution/provisioning}/package-info.java|  6 +-
 .../StaticGrpcProvisionServiceTest.java| 97 ++
 3 files changed, 154 insertions(+), 4 deletions(-)
 create mode 100644 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/provisioning/StaticGrpcProvisionService.java
 copy 
{sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider
 => 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/provisioning}/package-info.java
 (90%)
 create mode 100644 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/provisioning/StaticGrpcProvisionServiceTest.java

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam] 01/01: Merge pull request #4421

2018-01-19 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 6b6800cfeffb17f13fe5f386215c4eb53d235af8
Merge: e86ddb4 cb8f7a2
Author: Thomas Groh 
AuthorDate: Fri Jan 19 13:33:55 2018 -0800

Merge pull request #4421

Implement a GRPC Provision Service

 .../provisioning/StaticGrpcProvisionService.java   | 55 
 .../fnexecution/provisioning/package-info.java | 20 +
 .../StaticGrpcProvisionServiceTest.java| 97 ++
 3 files changed, 172 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam] branch master updated: Use platformThreadFactory for default thread pool.

2018-01-19 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new e86ddb4  Use platformThreadFactory for default thread pool.
e86ddb4 is described below

commit e86ddb4103e85c927e3f13a312862f161d5d3d60
Author: Ilya Figotin 
AuthorDate: Thu Jan 18 15:12:35 2018 -0800

Use platformThreadFactory for default thread pool.
---
 .../main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
index 2d2c63a..5ba2b01 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
@@ -106,7 +106,8 @@ class PackageUtil implements Closeable {
 
   public static PackageUtil withDefaultThreadPool() {
 return PackageUtil.withExecutorService(
-MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(DEFAULT_THREAD_POOL_SIZE)));
+MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(DEFAULT_THREAD_POOL_SIZE,
+MoreExecutors.platformThreadFactory())));
   }
 
   public static PackageUtil withExecutorService(ListeningExecutorService 
executorService) {

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[jira] [Updated] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Nalseez Duke (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nalseez Duke updated BEAM-3503:
---
Affects Version/s: 2.2.0

> PubsubIO - DynamicDestinations
> --
>
> Key: BEAM-3503
> URL: https://issues.apache.org/jira/browse/BEAM-3503
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: 2.2.0
>Reporter: Nalseez Duke
>Assignee: Reuven Lax
>Priority: Minor
>
> PubsubIO does not support the "DynamicDestinations" notion that is currently 
> implemented for File-based I/O and BigQueryIO.
> It would be nice if PubsubIO could also support this functionality - the 
> ability to write to a Pub/Sub topic dynamically.
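
For context, a rough sketch of the BigQueryIO-style {{DynamicDestinations}} notion being referenced (Beam Java SDK; the project, dataset, field names, and schema below are made up). The request is for PubsubIO to offer the same kind of per-element routing, with the destination chosen from the element itself:

{code:java}
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.common.collect.ImmutableList;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.ValueInSingleWindow;

class DynamicWriteExample {
  static void write(PCollection<TableRow> rows) {
    rows.apply(
        BigQueryIO.writeTableRows()
            .to(
                new DynamicDestinations<TableRow, String>() {
                  @Override
                  public String getDestination(ValueInSingleWindow<TableRow> element) {
                    // Route each row by one of its fields.
                    return (String) element.getValue().get("team");
                  }

                  @Override
                  public TableDestination getTable(String team) {
                    return new TableDestination(
                        "my-project:my_dataset.events_" + team, "Events for team " + team);
                  }

                  @Override
                  public TableSchema getSchema(String team) {
                    return new TableSchema()
                        .setFields(ImmutableList.of(
                            new TableFieldSchema().setName("team").setType("STRING")));
                  }
                }));
  }
}
{code}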



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Nalseez Duke (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nalseez Duke updated BEAM-3503:
---
Affects Version/s: (was: 2.2.0)

> PubsubIO - DynamicDestinations
> --
>
> Key: BEAM-3503
> URL: https://issues.apache.org/jira/browse/BEAM-3503
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions, sdk-java-gcp
>Reporter: Nalseez Duke
>Assignee: Reuven Lax
>Priority: Minor
>
> PubsubIO does not support the "DynamicDestinations" notion that is currently 
> implemented for File-based I/O and BigQueryIO.
> It would be nice if PubsubIO could also support this functionality - the 
> ability to write to a Pub/Sub topic dynamically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4755

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Nalseez Duke (JIRA)
Nalseez Duke created BEAM-3503:
--

 Summary: PubsubIO - DynamicDestinations
 Key: BEAM-3503
 URL: https://issues.apache.org/jira/browse/BEAM-3503
 Project: Beam
  Issue Type: New Feature
  Components: sdk-java-extensions, sdk-java-gcp
Affects Versions: 2.2.0
Reporter: Nalseez Duke
Assignee: Reuven Lax


PubsubIO does not support the "DynamicDestinations" notion that is currently 
implemented for File-based I/O and BigQueryIO.

 

It would be nice if PubsubIO could also support this functionality - the 
ability to write to a Pub/Sub topic dynamically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3503) PubsubIO - DynamicDestinations

2018-01-19 Thread Nalseez Duke (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nalseez Duke updated BEAM-3503:
---
Description: 
PubsubIO does not support the "DynamicDestinations" notion that is currently 
implemented for File-based I/O and BigQueryIO.

It would be nice if PubsubIO could also support this functionality - the 
ability to write to a Pub/Sub topic dynamically.

  was:
PubsubIO does not support the "DynamicDestinations" notion that is currently 
implemented for File-based I/O and BigQueryIO.

 

It would be nice if PubsubIO could also support this functionality - the 
ability to write to a Pub/Sub topic dynamically.


> PubsubIO - DynamicDestinations
> --
>
> Key: BEAM-3503
> URL: https://issues.apache.org/jira/browse/BEAM-3503
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: 2.2.0
>Reporter: Nalseez Duke
>Assignee: Reuven Lax
>Priority: Minor
>
> PubsubIO does not support the "DynamicDestinations" notion that is currently 
> implemented for File-based I/O and BigQueryIO.
> It would be nice if PubsubIO could also support this functionality - the 
> ability to write to a Pub/Sub topic dynamically.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3502) Avoid use of proto.Builder.clone() in DatastoreIO

2018-01-19 Thread Larry Li (JIRA)
Larry Li created BEAM-3502:
--

 Summary: Avoid use of proto.Builder.clone() in DatastoreIO
 Key: BEAM-3502
 URL: https://issues.apache.org/jira/browse/BEAM-3502
 Project: Beam
  Issue Type: Improvement
  Components: beam-model
Affects Versions: 2.2.0
Reporter: Larry Li
Assignee: Kenneth Knowles
 Fix For: 2.3.0


DatastoreIO uses proto.Builder.clone() here:

[https://github.com/apache/beam/blob/c0f0e1fd63ce1e9dfe1db71adf1c8b9e88ce7038/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java#L893]

 

It's only used in one place for actual runtime code, but this causes 
incompatibility problems with Google-internal Java proto generation, i.e. we 
get a 'NoSuchMethodError' when attempting to run the pipeline with internal 
build tools.

 

This is a known problem that's already been worked around once:

https://issues.apache.org/jira/browse/BEAM-2392

..but the fix only applied to BigtableServiceImpl. This extends those changes 
to DatastoreIO, replacing its single use of clone(). Associated tests shouldn't 
need refactoring, as this only appears as a problem at runtime.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3940

2018-01-19 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #4754

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2141) beam_PerformanceTests_JDBC have not passed in weeks

2018-01-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332783#comment-16332783
 ] 

ASF GitHub Bot commented on BEAM-2141:
--

chamikaramj closed pull request #3668: [BEAM-2141] Updates jenkins job for 
JDBCIOIT
URL: https://github.com/apache/beam/pull/3668
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/common_job_properties.groovy 
b/.test-infra/jenkins/common_job_properties.groovy
index 2930d741608..7d0f49b362e 100644
--- a/.test-infra/jenkins/common_job_properties.groovy
+++ b/.test-infra/jenkins/common_job_properties.groovy
@@ -254,20 +254,26 @@ class common_job_properties {
 return mapToArgString(joinedArgs)
   }
 
+  private static def buildPerfKit(def context) {
+context.steps {
+  // Clean up environment.
+  shell('rm -rf PerfKitBenchmarker')
+  // Clone appropriate perfkit branch
+  shell('git clone 
https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git')
+  // Install Perfkit benchmark requirements.
+  shell('pip install --user -r PerfKitBenchmarker/requirements.txt')
+  // Install job requirements for Python SDK.
+  shell('pip install --user -e sdks/python/[gcp,test]')
+}
+  }
+
   // Adds the standard performance test job steps.
   static def buildPerformanceTest(def context, def argMap) {
 def pkbArgs = genPerformanceArgs(argMap)
+buildPerfKit(context)
 context.steps {
-// Clean up environment.
-shell('rm -rf PerfKitBenchmarker')
-// Clone appropriate perfkit branch
-shell('git clone 
https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git')
-// Install Perfkit benchmark requirements.
-shell('pip install --user -r PerfKitBenchmarker/requirements.txt')
-// Install job requirements for Python SDK.
-shell('pip install --user -e sdks/python/[gcp,test]')
-// Launch performance test.
-shell("python PerfKitBenchmarker/pkb.py $pkbArgs")
+  // Launch performance test.
+  shell("python PerfKitBenchmarker/pkb.py $pkbArgs")
 }
   }
 
diff --git a/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy 
b/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy
index ef73a261b0c..6b0abe91fba 100644
--- a/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy
+++ b/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy
@@ -32,32 +32,59 @@ job('beam_PerformanceTests_JDBC'){
 'commits@beam.apache.org',
 false)
 
-def pipelineArgs = [
-tempRoot: 'gs://temp-storage-for-end-to-end-tests',
-project: 'apache-beam-testing',
-postgresServerName: '10.36.0.11',
-postgresUsername: 'postgres',
-postgresDatabaseName: 'postgres',
-postgresPassword: 'uuinkks',
-postgresSsl: 'false'
+common_job_properties.buildPerfKit(delegate)
+
+// Allows triggering this build against pull requests.
+common_job_properties.enablePhraseTriggeringFromPullRequest(
+delegate,
+'JDBC Performance Test',
+'Run JDBC Performance Test')
+
+clean_install_command = [
+'/home/jenkins/tools/maven/latest/bin/mvn',
+'-B',
+'-e',
+"-Pdataflow-runner",
+'clean',
+'install',
+// TODO: remove following since otherwise build could break due to 
changes to other mvn projects
+"-pl runners/google-cloud-dataflow-java",
+'-DskipTests'
 ]
-def pipelineArgList = []
-pipelineArgs.each({
-key, value -> pipelineArgList.add("--$key=$value")
-})
-def pipelineArgsJoined = pipelineArgList.join(',')
-
-def argMap = [
-  benchmarks: 'beam_integration_benchmark',
-  beam_it_module: 'sdks/java/io/jdbc',
-  beam_it_args: pipelineArgsJoined,
-  beam_it_class: 'org.apache.beam.sdk.io.jdbc.JdbcIOIT',
-  // Profile is located in $BEAM_ROOT/sdks/java/io/pom.xml.
-  beam_it_profile: 'io-it'
+
+io_it_suite_command = [
+'/home/jenkins/tools/maven/latest/bin/mvn',
+'-B',
+'-e',
+'verify',
+'-pl sdks/java/io/jdbc',
+'-Dio-it-suite',
+'-DpkbLocation="$WORKSPACE/PerfKitBenchmarker/pkb.py"',
+'-DmvnBinary=/home/jenkins/tools/maven/latest/bin/mvn',
+'-Dkubectl=/usr/lib/google-cloud-sdk/bin/kubectl',
+//'-Dkubeconfig=, # TODO(chamikara): should we set this
+'-DintegrationTestPipelineOptions=\'[ 
"--project=apache-beam-testing", 
"--tempRoot=gs://temp-storage-for-end-to-end-tests" ]\''
 ]
 
-

[jira] [Commented] (BEAM-3412) Update BigTable client version to 1.0

2018-01-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332782#comment-16332782
 ] 

ASF GitHub Bot commented on BEAM-3412:
--

chamikaramj closed pull request #4347: [BEAM-3412] Updates BigTable client 
version to 1.0
URL: https://github.com/apache/beam/pull/4347
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/pom.xml b/pom.xml
index 987a995fd71..d9378442964 100644
--- a/pom.xml
+++ b/pom.xml
@@ -110,7 +110,7 @@
 2.33
 1.8.2
 v2-rev355-1.22.0
-1.0.0-pre3
+1.0.0
 v1-rev6-1.22.0
 0.1.18
 v2-rev8-1.22.0


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update BigTable client version to 1.0
> -
>
> Key: BEAM-3412
> URL: https://issues.apache.org/jira/browse/BEAM-3412
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Chamikara Jayalath
>Assignee: Solomon Duskis
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3939

2018-01-19 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-3493) Prevent users from "implementing" PipelineOptions

2018-01-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-3493:

Priority: Minor  (was: Major)

> Prevent users from "implementing" PipelineOptions
> -
>
> Key: BEAM-3493
> URL: https://issues.apache.org/jira/browse/BEAM-3493
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Minor
>  Labels: newbie, starter
>
> I've seen a user implement \{{PipelineOptions}}. This implies that it is 
> backwards-incompatible to add new options, which is of course not our intent. 
> We should at least document very loudly that it is not to be implemented, and 
> preferably have some automation that will fail on load if they have 
> implemented it. Ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3493) Prevent users from "implementing" PipelineOptions

2018-01-19 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332765#comment-16332765
 ] 

Luke Cwik commented on BEAM-3493:
-

We could just make sure that Pipeline.create(PipelineOptions) verifies that the 
PipelineOptions instance is always a proxy class backed by ProxyInvocationHandler. 
This would prevent people from creating concrete implementations, and any that are 
created would be OK to break, since they wouldn't be able to function with the 
Apache Beam APIs.
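
For context, a sketch of the intended usage versus the problematic one (interface and option names here are placeholders): options are meant to be declared as an interface and instantiated through {{PipelineOptionsFactory}}, which returns a dynamic proxy, never implemented as a concrete class.

{code:java}
import org.apache.beam.sdk.options.Default;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class OptionsUsageExample {

  // Intended: declare an interface; the SDK generates the implementation.
  public interface MyOptions extends PipelineOptions {
    @Description("Input path")
    @Default.String("/tmp/input")
    String getInputPath();

    void setInputPath(String value);
  }

  public static void main(String[] args) {
    // Supported: a dynamic proxy backed by ProxyInvocationHandler.
    MyOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(MyOptions.class);
    System.out.println(options.getInputPath());

    // Problematic (what this issue wants to prevent): a hand-written
    // "class MyOptionsImpl implements MyOptions { ... }" breaks whenever
    // new getters/setters are added to PipelineOptions.
  }
}
{code}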

> Prevent users from "implementing" PipelineOptions
> -
>
> Key: BEAM-3493
> URL: https://issues.apache.org/jira/browse/BEAM-3493
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Luke Cwik
>Priority: Major
>  Labels: newbie, starter
>
> I've seen a user implement \{{PipelineOptions}}. This implies that it is 
> backwards-incompatible to add new options, which is of course not our intent. 
> We should at least document very loudly that it is not to be implemented, and 
> preferably have some automation that will fail on load if they have 
> implemented it. Ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3493) Prevent users from "implementing" PipelineOptions

2018-01-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-3493:

Labels: newbie starter  (was: )

> Prevent users from "implementing" PipelineOptions
> -
>
> Key: BEAM-3493
> URL: https://issues.apache.org/jira/browse/BEAM-3493
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Minor
>  Labels: newbie, starter
>
> I've seen a user implement \{{PipelineOptions}}. This implies that it is 
> backwards-incompatible to add new options, which is of course not our intent. 
> We should at least document very loudly that it is not to be implemented, and 
> preferably have some automation that will fail on load if they have 
> implemented it. Ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3493) Prevent users from "implementing" PipelineOptions

2018-01-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-3493:
---

Assignee: (was: Luke Cwik)

> Prevent users from "implementing" PipelineOptions
> -
>
> Key: BEAM-3493
> URL: https://issues.apache.org/jira/browse/BEAM-3493
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: newbie, starter
>
> I've seen a user implement \{{PipelineOptions}}. This implies that it is 
> backwards-incompatible to add new options, which is of course not our intent. 
> We should at least document very loudly that it is not to be implemented, and 
> preferably have some automation that will fail on load if they have 
> implemented it. Ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-413) Mean$CountSum tests for floating point equality

2018-01-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332742#comment-16332742
 ] 

ASF GitHub Bot commented on BEAM-413:
-

kennknowles closed pull request #4219: [BEAM-413] Created local annotation for 
floating point equality warning.
URL: https://github.com/apache/beam/pull/4219
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml 
b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
index 2035f854203..ca0989c657c 100644
--- a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
+++ b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
@@ -331,12 +331,6 @@
 
 
   
-  
-
-
-
-
-  
   
 
 
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Mean.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Mean.java
index 8932b03e527..69deddb276e 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Mean.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Mean.java
@@ -156,6 +156,10 @@ public Double extractOutput() {
 }
 
 @Override
+@edu.umd.cs.findbugs.annotations.SuppressWarnings(
+value = "FE_FLOATING_POINT_EQUALITY",
+justification = "Comparing doubles directly since equals method is 
only used in coder test."
+)
 public boolean equals(Object other) {
   if (!(other instanceof CountSum)) {
 return false;


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Mean$CountSum tests for floating point equality
> ---
>
> Key: BEAM-413
> URL: https://issues.apache.org/jira/browse/BEAM-413
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Batkhuyag Batsaikhan
>Priority: Minor
>  Labels: findbugs, newbie, starter
>
> [FindBugs 
> FE_FLOATING_POINT_EQUALITY|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L298]:
>  Test for floating point equality
> Applies to: 
> [Mean$CountSum.equals|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Mean.java#L165].
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #5701

2018-01-19 Thread Apache Jenkins Server
See 




[beam] branch master updated (fca18e5 -> edb8389)

2018-01-19 Thread kenn
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from fca18e5  Merge pull request #4263: [BEAM-3351] Fix Javadoc formatting 
issues
 add 1288c3b  Moved floating point equality findbugs annotation from 
generic xml file into the function that has the warning.
 new edb8389  Merge pull request #4219: [BEAM-413] Created local annotation 
for floating point equality warning.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml   | 6 --
 .../core/src/main/java/org/apache/beam/sdk/transforms/Mean.java | 4 
 2 files changed, 4 insertions(+), 6 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].

