[GitHub] incubator-beam pull request #1368: [BEAM-498] Transmit new DoFn in Dataflow ...

2016-11-15 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1368

[BEAM-498] Transmit new DoFn in Dataflow translator

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam DataflowRunner-DoFnInfo

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1368.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1368


commit 4afbb0ddc86886a4402e403db2cdfea8df966c76
Author: Kenneth Knowles 
Date:   2016-11-16T06:27:35Z

Transmit new DoFn, not OldDoFn, in Dataflow translator




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669602#comment-15669602 ]

ASF GitHub Bot commented on BEAM-498:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1368

[BEAM-498] Transmit new DoFn in Dataflow translator





> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: Connect generated DoFnInvoker.invokerOnTimer to OnTimerInvoker

2016-11-15 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master dbbd5e448 -> dc94dbdd7


Connect generated DoFnInvoker.invokerOnTimer to OnTimerInvoker


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a945a025
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a945a025
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a945a025

Branch: refs/heads/master
Commit: a945a025301ca09a4cfc160302ef3914429dc15e
Parents: dbbd5e4
Author: Kenneth Knowles 
Authored: Tue Nov 1 21:23:48 2016 -0700
Committer: Kenneth Knowles 
Committed: Tue Nov 15 20:08:41 2016 -0800

--
 .../reflect/ByteBuddyDoFnInvokerFactory.java| 152 ++-
 .../reflect/ByteBuddyOnTimerInvokerFactory.java |  24 ++-
 .../sdk/transforms/reflect/DoFnInvoker.java |   4 +
 .../sdk/transforms/reflect/DoFnInvokers.java|   6 +
 .../sdk/transforms/reflect/OnTimerInvoker.java  |   2 +-
 .../transforms/reflect/DoFnInvokersTest.java| 137 -
 .../testhelper/DoFnInvokersTestHelper.java  | 137 +
 7 files changed, 415 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a945a025/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
index c137255..bc6d8c9 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
@@ -23,6 +23,7 @@ import java.lang.reflect.Constructor;
 import java.lang.reflect.InvocationTargetException;
 import java.lang.reflect.Method;
 import java.util.ArrayList;
+import java.util.HashMap;
 import java.util.LinkedHashMap;
 import java.util.List;
 import java.util.Map;
@@ -31,7 +32,6 @@ import net.bytebuddy.ByteBuddy;
 import net.bytebuddy.NamingStrategy;
 import net.bytebuddy.description.field.FieldDescription;
 import net.bytebuddy.description.method.MethodDescription;
-import net.bytebuddy.description.modifier.FieldManifestation;
 import net.bytebuddy.description.modifier.Visibility;
 import net.bytebuddy.description.type.TypeDescription;
 import net.bytebuddy.description.type.TypeList;
@@ -65,6 +65,7 @@ import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.CoderRegistry;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.DoFn.ProcessElement;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignature.OnTimerMethod;
 import org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.Cases;
 import org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.ContextParameter;
 import org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.InputProviderParameter;
@@ -128,6 +129,54 @@ public class ByteBuddyDoFnInvokerFactory implements DoFnInvokerFactory {
     return newByteBuddyInvoker(DoFnSignatures.getSignature((Class) fn.getClass()), fn);
   }
 
+  /**
+   * Internal base class for generated {@link DoFnInvoker} instances.
+   *
+   * <p>This class should not be extended directly, or by Beam users. It must be public for
+   * generated instances to have adequate access, as they are generated "inside" the invoked
+   * {@link DoFn} class.
+   */
+  public abstract static class DoFnInvokerBase<InputT, OutputT, DoFnT extends DoFn<InputT, OutputT>>
+      implements DoFnInvoker<InputT, OutputT> {
+    protected DoFnT delegate;
+
+    private Map<String, OnTimerInvoker> onTimerInvokers = new HashMap<>();
+
+    public DoFnInvokerBase(DoFnT delegate) {
+      this.delegate = delegate;
+    }
+
+    /**
+     * Associates the given timer ID with the given {@link OnTimerInvoker}.
+     *
+     * <p>ByteBuddy does not like to generate conditional code, so we use a map + lookup
+     * of the timer ID rather than a generated conditional branch to choose which
+     * OnTimerInvoker to invoke.
+     *
+     * <p>This method has package-level access as it is intended only for assembly of the
+     * {@link DoFnInvokerBase}, not by any subclass.
+     */
+    void addOnTimerInvoker(String timerId, OnTimerInvoker onTimerInvoker) {
+      this.onTimerInvokers.put(timerId, onTimerInvoker);
+    }
+
+    @Override
+    public void invokeOnTimer(String timerId, DoFn.ArgumentProvider<InputT, OutputT> arguments) {
+      @Nullable OnTimerInvoker onTimerInvoker = onTimerInvokers.get(timerId);
+
+      if
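The map-plus-lookup dispatch described in the Javadoc above — registering each OnTimerInvoker under its timer ID and selecting it at invocation time, rather than generating a conditional branch per timer — can be illustrated with a plain dictionary. This is a hypothetical Python sketch (the names `TimerDispatcher` and `add_on_timer_invoker` are invented for illustration), not Beam's generated Java code:

```python
class TimerDispatcher:
    """Toy analogue of DoFnInvokerBase's timer dispatch: a map from
    timer ID to invoker, consulted at invocation time instead of a
    generated if/else chain."""

    def __init__(self):
        self._on_timer_invokers = {}  # timer ID -> callable

    def add_on_timer_invoker(self, timer_id, invoker):
        # Registration happens once, when the dispatcher is assembled.
        self._on_timer_invokers[timer_id] = invoker

    def invoke_on_timer(self, timer_id, *args):
        # Dispatch is a single map lookup, whatever the number of timers.
        invoker = self._on_timer_invokers.get(timer_id)
        if invoker is None:
            raise ValueError("No invoker registered for timer %r" % timer_id)
        return invoker(*args)

dispatcher = TimerDispatcher()
dispatcher.add_on_timer_invoker("ts-cleanup", lambda: "cleanup fired")
print(dispatcher.invoke_on_timer("ts-cleanup"))  # -> cleanup fired
```

The design choice is the same in both settings: a lookup table keeps dispatch cost flat and sidesteps generating branching code.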

[GitHub] incubator-beam pull request #1307: [BEAM-27] Connect generated DoFnInvoker's...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1307




[2/2] incubator-beam git commit: This closes #1307

2016-11-15 Thread kenn
This closes #1307


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/dc94dbdd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/dc94dbdd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/dc94dbdd

Branch: refs/heads/master
Commit: dc94dbdd7b93a98f1dbe7a616f5134b95be4563c
Parents: dbbd5e4 a945a02
Author: Kenneth Knowles 
Authored: Tue Nov 15 21:18:31 2016 -0800
Committer: Kenneth Knowles 
Committed: Tue Nov 15 21:18:31 2016 -0800

--
 .../reflect/ByteBuddyDoFnInvokerFactory.java| 152 ++-
 .../reflect/ByteBuddyOnTimerInvokerFactory.java |  24 ++-
 .../sdk/transforms/reflect/DoFnInvoker.java |   4 +
 .../sdk/transforms/reflect/DoFnInvokers.java|   6 +
 .../sdk/transforms/reflect/OnTimerInvoker.java  |   2 +-
 .../transforms/reflect/DoFnInvokersTest.java| 137 -
 .../testhelper/DoFnInvokersTestHelper.java  | 137 +
 7 files changed, 415 insertions(+), 47 deletions(-)
--




[jira] [Commented] (BEAM-27) Add user-ready API for interacting with timers

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15669463#comment-15669463 ]

ASF GitHub Bot commented on BEAM-27:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1307


> Add user-ready API for interacting with timers
> --
>
> Key: BEAM-27
> URL: https://issues.apache.org/jira/browse/BEAM-27
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> Pipeline authors will benefit from a different factorization of interaction 
> with underlying timers. The current APIs are targeted at runner implementers.





Jenkins build is back to normal : beam_PostCommit_MavenVerify #1830

2016-11-15 Thread Apache Jenkins Server
See 



[incubator-beam] Git Push Summary

2016-11-15 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/apex-runner [deleted] 99001575d


[jira] [Commented] (BEAM-943) Implement Datastore query splitter for python

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668812#comment-15668812 ]

ASF GitHub Bot commented on BEAM-943:
-

Github user vikkyrk closed the pull request at:

https://github.com/apache/incubator-beam/pull/1310


> Implement Datastore query splitter for python
> -
>
> Key: BEAM-943
> URL: https://issues.apache.org/jira/browse/BEAM-943
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>






[jira] [Resolved] (BEAM-983) Spark runner fails build on missing license and broken checkstyle.

2016-11-15 Thread Amit Sela (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amit Sela resolved BEAM-983.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Spark runner fails build on missing license and broken checkstyle. 
> ---
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Eugene Kirpichov
> Fix For: 0.4.0-incubating
>
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.
> Also some unused imports and import order issues.





[jira] [Commented] (BEAM-983) Spark runner fails build on missing license and broken checkstyle.

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668802#comment-15668802 ]

ASF GitHub Bot commented on BEAM-983:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1364


> Spark runner fails build on missing license and broken checkstyle. 
> ---
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Eugene Kirpichov
> Fix For: 0.4.0-incubating
>
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.
> Also some unused imports and import order issues.





[1/2] incubator-beam git commit: [BEAM-983] Fix a bunch of precommit errors from #1332

2016-11-15 Thread amitsela
Repository: incubator-beam
Updated Branches:
  refs/heads/master 201110222 -> dbbd5e448


[BEAM-983] Fix a bunch of precommit errors from #1332

Renames TestPipelineOptions to SparkTestPipelineOptions

To avoid confusion with sdk.testing.TestPipelineOptions.
Also, a couple of other minor fixes.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/dd740ee1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/dd740ee1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/dd740ee1

Branch: refs/heads/master
Commit: dd740ee1b20ab6921db3620ac28499dc66511482
Parents: 2011102
Author: Eugene Kirpichov 
Authored: Tue Nov 15 14:25:51 2016 -0800
Committer: Sela 
Committed: Wed Nov 16 01:49:59 2016 +0200

--
 .../runners/spark/ProvidedSparkContextTest.java |  2 -
 .../metrics/sink/NamedAggregatorsTest.java  |  4 +-
 .../beam/runners/spark/io/AvroPipelineTest.java |  5 +--
 .../beam/runners/spark/io/NumShardsTest.java|  5 +--
 .../io/hadoop/HadoopFileFormatPipelineTest.java |  5 +--
 .../spark/translation/SideEffectsTest.java  | 34 ++-
 .../streaming/EmptyStreamAssertionTest.java |  5 ++-
 .../streaming/FlattenStreamingTest.java |  5 ++-
 .../streaming/KafkaStreamingTest.java   |  5 ++-
 .../ResumeFromCheckpointStreamingTest.java  |  5 ++-
 .../streaming/SimpleStreamingWordCountTest.java |  5 ++-
 .../utils/SparkTestPipelineOptions.java | 42 +++
 .../SparkTestPipelineOptionsForStreaming.java   | 44 
 .../streaming/utils/TestPipelineOptions.java| 25 ---
 .../utils/TestPipelineOptionsForStreaming.java  | 44 
 15 files changed, 121 insertions(+), 114 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/dd740ee1/runners/spark/src/test/java/org/apache/beam/runners/spark/ProvidedSparkContextTest.java
--
diff --git a/runners/spark/src/test/java/org/apache/beam/runners/spark/ProvidedSparkContextTest.java b/runners/spark/src/test/java/org/apache/beam/runners/spark/ProvidedSparkContextTest.java
index bc337c7..fe73aba 100644
--- a/runners/spark/src/test/java/org/apache/beam/runners/spark/ProvidedSparkContextTest.java
+++ b/runners/spark/src/test/java/org/apache/beam/runners/spark/ProvidedSparkContextTest.java
@@ -25,7 +25,6 @@ import java.util.Arrays;
 import java.util.List;
 import java.util.Set;
 import org.apache.beam.runners.spark.examples.WordCount;
-import org.apache.beam.runners.spark.translation.streaming.utils.TestPipelineOptions;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
@@ -34,7 +33,6 @@ import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.MapElements;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.spark.api.java.JavaSparkContext;
-import org.junit.Rule;
 import org.junit.Test;
 
 /**

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/dd740ee1/runners/spark/src/test/java/org/apache/beam/runners/spark/aggregators/metrics/sink/NamedAggregatorsTest.java
--
diff --git a/runners/spark/src/test/java/org/apache/beam/runners/spark/aggregators/metrics/sink/NamedAggregatorsTest.java b/runners/spark/src/test/java/org/apache/beam/runners/spark/aggregators/metrics/sink/NamedAggregatorsTest.java
index c220f2b..c16574c 100644
--- a/runners/spark/src/test/java/org/apache/beam/runners/spark/aggregators/metrics/sink/NamedAggregatorsTest.java
+++ b/runners/spark/src/test/java/org/apache/beam/runners/spark/aggregators/metrics/sink/NamedAggregatorsTest.java
@@ -28,7 +28,7 @@ import java.util.List;
 import java.util.Set;
 import org.apache.beam.runners.spark.SparkPipelineOptions;
 import org.apache.beam.runners.spark.examples.WordCount;
-import org.apache.beam.runners.spark.translation.streaming.utils.TestPipelineOptions;
+import org.apache.beam.runners.spark.translation.streaming.utils.SparkTestPipelineOptions;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
 import org.apache.beam.sdk.testing.PAssert;
@@ -52,7 +52,7 @@ public class NamedAggregatorsTest {
   public ClearAggregatorsRule clearAggregators = new ClearAggregatorsRule();
 
   @Rule
-  public final TestPipelineOptions pipelineOptions = new TestPipelineOptions();
+  public final SparkTestPipelineOptions pipelineOptions = new SparkTestPipelineOptions();
 
   private Pipeline createSparkPipeline() {
 SparkPipelineOptions options = pipelineOptions.getOptions();


[GitHub] incubator-beam pull request #1364: [BEAM-983] Fixes after #1332

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1364




[GitHub] incubator-beam pull request #1367: Declare archetype starter dependency on s...

2016-11-15 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/incubator-beam/pull/1367

Declare archetype starter dependency on slf4j

Example failed log before this PR: 
https://api.travis-ci.org/jobs/176202650/log.txt?deansi=true

R: @lukecwik 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam declare-dependency

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1367.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1367


commit b6723eaa79b01727ebafa01aa3e72a113c1e369b
Author: Eugene Kirpichov 
Date:   2016-11-16T00:01:42Z

Declare archetype dependency on slf4j






[2/2] incubator-beam git commit: This closes #1364

2016-11-15 Thread amitsela
This closes #1364


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/dbbd5e44
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/dbbd5e44
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/dbbd5e44

Branch: refs/heads/master
Commit: dbbd5e44800586c3f1a2efc58b91d79126c843d3
Parents: 2011102 dd740ee
Author: Sela 
Authored: Wed Nov 16 01:51:51 2016 +0200
Committer: Sela 
Committed: Wed Nov 16 01:51:51 2016 +0200

--
 .../runners/spark/ProvidedSparkContextTest.java |  2 -
 .../metrics/sink/NamedAggregatorsTest.java  |  4 +-
 .../beam/runners/spark/io/AvroPipelineTest.java |  5 +--
 .../beam/runners/spark/io/NumShardsTest.java|  5 +--
 .../io/hadoop/HadoopFileFormatPipelineTest.java |  5 +--
 .../spark/translation/SideEffectsTest.java  | 34 ++-
 .../streaming/EmptyStreamAssertionTest.java |  5 ++-
 .../streaming/FlattenStreamingTest.java |  5 ++-
 .../streaming/KafkaStreamingTest.java   |  5 ++-
 .../ResumeFromCheckpointStreamingTest.java  |  5 ++-
 .../streaming/SimpleStreamingWordCountTest.java |  5 ++-
 .../utils/SparkTestPipelineOptions.java | 42 +++
 .../SparkTestPipelineOptionsForStreaming.java   | 44 
 .../streaming/utils/TestPipelineOptions.java| 25 ---
 .../utils/TestPipelineOptionsForStreaming.java  | 44 
 15 files changed, 121 insertions(+), 114 deletions(-)
--




[jira] [Created] (BEAM-988) Support for testing how soon output is emitted

2016-11-15 Thread Eugene Kirpichov (JIRA)
Eugene Kirpichov created BEAM-988:
-

 Summary: Support for testing how soon output is emitted
 Key: BEAM-988
 URL: https://issues.apache.org/jira/browse/BEAM-988
 Project: Beam
  Issue Type: Bug
Reporter: Eugene Kirpichov
Assignee: Thomas Groh


I ran into this issue when testing Splittable DoFn. My intention is that it should 
behave exactly like a DoFn, i.e. emit output immediately when it receives 
input, regardless of the windowing/triggering strategy of the input (even 
though SDF has a GBK internally).

However, currently the SDK doesn't have facilities for testing that. TestStream 
allows controlling the timing of the input, but there's nothing to capture the 
timing of the output. Moreover, the timing of the output is unspecified by the 
model, because triggers technically only enable firing but do not force it 
(i.e. they are a lower bound on when output will be emitted).

I'm not sure of the best way to address this. E.g., perhaps PaneInfo could 
include a field "since when was this pane enabled to fire" (regardless of when 
it really fired)?





[jira] [Created] (BEAM-987) TestStream.advanceWatermarkToInfinity should perhaps also advance processing time

2016-11-15 Thread Eugene Kirpichov (JIRA)
Eugene Kirpichov created BEAM-987:
-

 Summary: TestStream.advanceWatermarkToInfinity should perhaps also 
advance processing time
 Key: BEAM-987
 URL: https://issues.apache.org/jira/browse/BEAM-987
 Project: Beam
  Issue Type: Bug
Reporter: Eugene Kirpichov
Assignee: Thomas Groh


I ran into this when writing a test for Splittable DoFn whose input was a 
TestStream. I constructed a TestStream that didn't call advanceProcessingTime, 
and as a result, the SDF's timers didn't fire and the test got stuck.

I think the meaning of "advanceWatermarkToInfinity" is "don't add any more 
elements to the stream and see what happens eventually", and "eventually" 
includes "eventually in the processing-time domain", not just in the event-time 
domain (watermark).





[jira] [Commented] (BEAM-986) ReduceFnRunner doesn't batch prefetching pane firings

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668726#comment-15668726 ]

ASF GitHub Bot commented on BEAM-986:
-

GitHub user scwhittle opened a pull request:

https://github.com/apache/incubator-beam/pull/1366

[BEAM-986] Improve ReduceFnRunner prefetching

Hi @kennknowles, can you please take a look?

This refactoring ensures that for batches of elements or timers we have at 
most two fetches: one to prefetch the should-fire bits and one to fetch the 
actual firing state. The current behavior can end up with two fetches per window 
triggered.
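The batching idea — a single bulk prefetch for the whole batch of windows or timers, then per-window processing against the already-fetched state — can be sketched with a toy state store. This is a hypothetical illustration (the `CountingStateStore` API is invented), not Beam's actual ReduceFnRunner or state interface:

```python
class CountingStateStore:
    """Toy state store that counts round trips. In a real runner a fetch
    is a remote read, so fewer round trips means lower latency."""

    def __init__(self, state):
        self._state = state
        self.round_trips = 0

    def fetch(self, keys):
        # One round trip per call, regardless of how many keys it carries.
        self.round_trips += 1
        return {k: self._state[k] for k in keys}

def process_windows_batched(store, windows):
    # One bulk prefetch for the whole batch...
    prefetched = store.fetch(list(windows))
    # ...then each window is processed against already-fetched state.
    return [prefetched[w] for w in windows]

def process_windows_sequential(store, windows):
    # The behavior being fixed: one fetch per window.
    return [store.fetch([w])[w] for w in windows]

state = {"w1": 1, "w2": 2, "w3": 3}
batched = CountingStateStore(state)
process_windows_batched(batched, ["w1", "w2", "w3"])
sequential = CountingStateStore(state)
process_windows_sequential(sequential, ["w1", "w2", "w3"])
print(batched.round_trips, sequential.round_trips)  # 1 vs 3
```

With three windows the batched path makes one round trip where the sequential path makes three; the PR's two-fetch bound (prefetch plus firing-state fetch) follows the same principle.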

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/scwhittle/incubator-beam reduce_fn_prefetching

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1366.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1366


commit 597af025865c6ae2208fc94ea8864491cd84ff37
Author: Sam Whittle 
Date:   2016-11-10T20:59:49Z

Improvements to ReduceFnRunner prefetching:
- add prefetch* methods for prefetching state matching existing methods
- replace onTimer with batched onTimers method to allow prefetching
  across timers
- prefetch triggers in processElements

commit 8f11421019e2acd603546071ae0d8383143a33be
Author: Sam Whittle 
Date:   2016-11-15T22:55:38Z

fix annotation




> ReduceFnRunner doesn't batch prefetching pane firings
> -
>
> Key: BEAM-986
> URL: https://issues.apache.org/jira/browse/BEAM-986
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Specifically
> - in ProcessElements, if there are multiple windows to consider, each is 
> processed sequentially with sequential state fetches instead of a bulk 
> prefetch
> - the onTimer method doesn't evaluate multiple timers at a time, meaning that if 
> multiple timers are fired at once, each is processed sequentially without 
> batched prefetching





[GitHub] incubator-beam pull request #1366: [BEAM-986] Improve ReduceFnRunner prefetc...

2016-11-15 Thread scwhittle
GitHub user scwhittle opened a pull request:

https://github.com/apache/incubator-beam/pull/1366

[BEAM-986] Improve ReduceFnRunner prefetching




[jira] [Created] (BEAM-986) ReduceFnRunner doesn't batch prefetching pane firings

2016-11-15 Thread Sam Whittle (JIRA)
Sam Whittle created BEAM-986:


 Summary: ReduceFnRunner doesn't batch prefetching pane firings
 Key: BEAM-986
 URL: https://issues.apache.org/jira/browse/BEAM-986
 Project: Beam
  Issue Type: Bug
  Components: runner-core
Reporter: Sam Whittle
Assignee: Sam Whittle


Specifically
- in ProcessElements, if there are multiple windows to consider, each is 
processed sequentially with sequential state fetches instead of a bulk prefetch
- the onTimer method doesn't evaluate multiple timers at a time, meaning that if 
multiple timers are fired at once, each is processed sequentially without 
batched prefetching





[jira] [Commented] (BEAM-983) Spark runner fails build on missing license and broken checkstyle.

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668713#comment-15668713 ]

ASF GitHub Bot commented on BEAM-983:
-

Github user amitsela closed the pull request at:

https://github.com/apache/incubator-beam/pull/1362


> Spark runner fails build on missing license and broken checkstyle. 
> ---
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Eugene Kirpichov
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.
> Also some unused imports and import order issues.





[GitHub] incubator-beam pull request #1362: [BEAM-983] runners/spark/translation/stre...

2016-11-15 Thread amitsela
Github user amitsela closed the pull request at:

https://github.com/apache/incubator-beam/pull/1362




[GitHub] incubator-beam pull request #1365: Fix shared state across retry decorated f...

2016-11-15 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/incubator-beam/pull/1365

Fix shared state across retry decorated functions

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

The fix was fairly straightforward and just needed to move the generator 
inside the wrapped function.
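The class of bug being fixed — a retry-budget generator created once at decoration time and therefore shared across every call of the decorated function — and the fix of creating it inside the wrapper can be reproduced in a minimal sketch (hypothetical names; not the actual Beam SDK retry module):

```python
import functools

def _retry_budget():
    # Generator yielding one token per allowed attempt (three here).
    yield from [1, 2, 3]

def with_retry(fn):
    # Buggy shape (pre-fix): `budget = _retry_budget()` HERE, at decoration
    # time, would mean every call of fn drains the same generator; once the
    # first call uses up its attempts, later calls get zero attempts.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        budget = _retry_budget()  # fix: fresh generator per call
        last_error = None
        for _ in budget:
            try:
                return fn(*args, **kwargs)
            except ValueError as e:
                last_error = e
        raise last_error
    return wrapper

calls = {"n": 0}

@with_retry
def flaky():
    # Fails twice, then succeeds on every third attempt.
    calls["n"] += 1
    if calls["n"] % 3 != 0:
        raise ValueError("transient")
    return "ok"

print(flaky())  # succeeds on the 3rd attempt
print(flaky())  # a second call gets its own fresh retry budget
```

With the generator hoisted outside `wrapper`, the second `flaky()` call would iterate over an exhausted generator and never attempt the function at all, which is the shared-state failure the PR describes.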

R: @aaltay @chamikaramj  PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/incubator-beam BEAM-985-fix-shared-state-in-retries

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1365.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1365


commit 5f865234dfbad47ae19a86959f92ccae7789c540
Author: Sourabh Bajaj 
Date:   2016-11-15T23:04:37Z

Fix shared state across retry decorated functions






[jira] [Updated] (BEAM-985) Retry decorator maintains state as it uses an iterator

2016-11-15 Thread Sourabh Bajaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Bajaj updated BEAM-985:
---
Description: 
https://github.com/apache/incubator-beam/commit/57c30c752a524a40c7074ea69541964c77f22748

shows two unit tests that fail due to state shared by the retry decorator's
iterator.

We have two options in fixing this:
1) Fix the retry decorator
2) Use an external package such as https://pypi.python.org/pypi/retrying


  was:
https://github.com/apache/incubator-beam/commit/2a207c928ee1edcfd32272abf12ed078d7daf26a

shows two unit tests that fail due to state shared by the retry decorator's
iterator.

We have two options in fixing this:
1) Fix the retry decorator
2) Use an external package such as https://pypi.python.org/pypi/retrying



> Retry decorator maintains state as it uses an iterator
> ---
>
> Key: BEAM-985
> URL: https://issues.apache.org/jira/browse/BEAM-985
> Project: Beam
>  Issue Type: Bug
>Reporter: Sourabh Bajaj
>Assignee: Sourabh Bajaj
> Fix For: 0.4.0-incubating
>
>
> https://github.com/apache/incubator-beam/commit/57c30c752a524a40c7074ea69541964c77f22748
> shows two unit tests that fail due to state shared by the retry decorator's
> iterator.
> We have two options in fixing this:
> 1) Fix the retry decorator
> 2) Use an external package such as https://pypi.python.org/pypi/retrying





[jira] [Updated] (BEAM-983) Spark runner fails build on missing license and broken checkstyle.

2016-11-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela updated BEAM-983:
---
Description: 
Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
points to 
runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
 as the culprit.

Also some unused imports and import order issues.

  was:Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) 
is failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
points to 
runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
 as the culprit.


> Spark runner fails build on missing license and broken checkstyle. 
> ---
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Eugene Kirpichov
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.
> Also some unused imports and import order issues.





[jira] [Updated] (BEAM-983) Spark runner fails build on missing license and broken checkstyle.

2016-11-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela updated BEAM-983:
---
Assignee: Eugene Kirpichov  (was: Amit Sela)

> Spark runner fails build on missing license and broken checkstyle. 
> ---
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Eugene Kirpichov
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.





[jira] [Updated] (BEAM-983) Spark runner fails build on missing license and broken checkstyle.

2016-11-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela updated BEAM-983:
---
Summary: Spark runner fails build on missing license and broken checkstyle. 
  (was: runners/spark/translation/streaming/utils/TestPipelineOptions.java 
missing Apache license header.)

> Spark runner fails build on missing license and broken checkstyle. 
> ---
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Amit Sela
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.





[GitHub] incubator-beam pull request #1364: Fixes after #1332

2016-11-15 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/incubator-beam/pull/1364

Fixes after #1332

#1332 introduced a few precommit failures. This PR fixes them and makes a 
couple of other minor changes noticed while doing the fixes.

R: @amitsela 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam fixup-1332

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1364.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1364


commit 97ae0b43d287a3715d43579f2001b0960e769a14
Author: Eugene Kirpichov 
Date:   2016-11-15T22:25:51Z

Fix a bunch of precommit errors from #1332

commit b19af6520312d52bf7fd39ecec7bc9d7a4a3febe
Author: Eugene Kirpichov 
Date:   2016-11-15T22:32:11Z

Renames TestPipelineOptions to SparkTestPipelineOptions

To avoid confusion with sdk.testing.TestPipelineOptions.
Also, a couple of other minor fixes.






[1/2] incubator-beam git commit: Query Splitter for Datastore v1

2016-11-15 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk d1fccbf5e -> 21b7844bb


Query Splitter for Datastore v1


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c1126b70
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c1126b70
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c1126b70

Branch: refs/heads/python-sdk
Commit: c1126b708469fc63bd8ab8e54026700408ec34da
Parents: d1fccbf
Author: Vikas Kedigehalli 
Authored: Mon Oct 24 18:29:29 2016 -0700
Committer: Robert Bradshaw 
Committed: Tue Nov 15 14:24:02 2016 -0800

--
 .../python/apache_beam/io/datastore/__init__.py |  16 ++
 .../apache_beam/io/datastore/v1/__init__.py |  16 ++
 .../apache_beam/io/datastore/v1/helper.py   |  84 ++
 .../apache_beam/io/datastore/v1/helper_test.py  | 124 +
 .../io/datastore/v1/query_splitter.py   | 270 +++
 .../io/datastore/v1/query_splitter_test.py  | 257 ++
 sdks/python/setup.py|   1 +
 7 files changed, 768 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c1126b70/sdks/python/apache_beam/io/datastore/__init__.py
--
diff --git a/sdks/python/apache_beam/io/datastore/__init__.py 
b/sdks/python/apache_beam/io/datastore/__init__.py
new file mode 100644
index 000..cce3aca
--- /dev/null
+++ b/sdks/python/apache_beam/io/datastore/__init__.py
@@ -0,0 +1,16 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c1126b70/sdks/python/apache_beam/io/datastore/v1/__init__.py
--
diff --git a/sdks/python/apache_beam/io/datastore/v1/__init__.py 
b/sdks/python/apache_beam/io/datastore/v1/__init__.py
new file mode 100644
index 000..cce3aca
--- /dev/null
+++ b/sdks/python/apache_beam/io/datastore/v1/__init__.py
@@ -0,0 +1,16 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c1126b70/sdks/python/apache_beam/io/datastore/v1/helper.py
--
diff --git a/sdks/python/apache_beam/io/datastore/v1/helper.py 
b/sdks/python/apache_beam/io/datastore/v1/helper.py
new file mode 100644
index 000..626ab35
--- /dev/null
+++ b/sdks/python/apache_beam/io/datastore/v1/helper.py
@@ -0,0 +1,84 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing 

[2/2] incubator-beam git commit: Closes #1310

2016-11-15 Thread robertwb
Closes #1310


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/21b7844b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/21b7844b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/21b7844b

Branch: refs/heads/python-sdk
Commit: 21b7844bb05e9a86531876cffe8ee776bfaaa1cc
Parents: d1fccbf c1126b7
Author: Robert Bradshaw 
Authored: Tue Nov 15 14:24:03 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 14:24:03 2016 -0800

--
 .../python/apache_beam/io/datastore/__init__.py |  16 ++
 .../apache_beam/io/datastore/v1/__init__.py |  16 ++
 .../apache_beam/io/datastore/v1/helper.py   |  84 ++
 .../apache_beam/io/datastore/v1/helper_test.py  | 124 +
 .../io/datastore/v1/query_splitter.py   | 270 +++
 .../io/datastore/v1/query_splitter_test.py  | 257 ++
 sdks/python/setup.py|   1 +
 7 files changed, 768 insertions(+)
--




[jira] [Created] (BEAM-985) Retry decorator maintains state as it uses an iterator

2016-11-15 Thread Sourabh Bajaj (JIRA)
Sourabh Bajaj created BEAM-985:
--

 Summary: Retry decorator maintains state as it uses an iterator
 Key: BEAM-985
 URL: https://issues.apache.org/jira/browse/BEAM-985
 Project: Beam
  Issue Type: Bug
Reporter: Sourabh Bajaj
Assignee: Sourabh Bajaj
 Fix For: 0.4.0-incubating


https://github.com/apache/incubator-beam/commit/2a207c928ee1edcfd32272abf12ed078d7daf26a

shows two unit tests that fail due to state shared by the retry decorator's
iterator.

We have two options in fixing this:
1) Fix the retry decorator
2) Use an external package such as https://pypi.python.org/pypi/retrying






Build failed in Jenkins: beam_PostCommit_MavenVerify #1828

2016-11-15 Thread Apache Jenkins Server
See 

Changes:

[klk] Rename DoFn.ExtraContextFactory to DoFn.ArgumentProvider

--
[EnvInject] - Mask passwords passed as build parameters.
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/incubator-beam.git # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/incubator-beam.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://github.com/apache/incubator-beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision 20111022256e641ad288c28d6953c15314eb6a0d 
(refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 20111022256e641ad288c28d6953c15314eb6a0d
 > git rev-list 503f26f448ea9f46fcfcdd46e60cba80e4844e28 # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
Parsing POMs
Modules changed, recalculating dependency graph
Established TCP socket on 47315
maven32-agent.jar already up to date
maven32-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
[beam_PostCommit_MavenVerify] $ /home/jenkins/tools/java/latest1.8/bin/java 
-Xmx2g -Xms256m -XX:MaxPermSize=512m -cp 
/home/jenkins/jenkins-slave/maven32-agent.jar:/home/jenkins/tools/maven/apache-maven-3.3.3/boot/plexus-classworlds-2.5.2.jar:/home/jenkins/tools/maven/apache-maven-3.3.3/conf/logging
 jenkins.maven3.agent.Maven32Main /home/jenkins/tools/maven/apache-maven-3.3.3 
/home/jenkins/jenkins-slave/slave.jar 
/home/jenkins/jenkins-slave/maven32-interceptor.jar 
/home/jenkins/jenkins-slave/maven3-interceptor-commons.jar 47315
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f 
 
-Dmaven.repo.local=
 -B -e -P release,dataflow-runner -DrepoToken= clean install 
coveralls:report -DskipITs=false -DintegrationTestPipelineOptions=[ 
"--project=apache-beam-testing", 
"--tempRoot=gs://temp-storage-for-end-to-end-tests", 
"--runner=org.apache.beam.runners.dataflow.testing.TestDataflowRunner" ]
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Apache Beam :: Parent
[INFO] Apache Beam :: SDKs :: Java :: Build Tools
[INFO] Apache Beam :: SDKs
[INFO] Apache Beam :: SDKs :: Java
[INFO] Apache Beam :: SDKs :: Java :: Core
[INFO] Apache Beam :: Runners
[INFO] Apache Beam :: Runners :: Core Java
[INFO] Apache Beam :: Runners :: Direct Java
[INFO] Apache Beam :: Runners :: Google Cloud Dataflow
[INFO] Apache Beam :: SDKs :: Java :: IO
[INFO] Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform
[INFO] Apache Beam :: SDKs :: Java :: IO :: HDFS
[INFO] Apache Beam :: SDKs :: Java :: IO :: JMS
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kafka
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kinesis
[INFO] Apache Beam :: SDKs :: Java :: IO :: MongoDB
[INFO] Apache Beam :: SDKs :: Java :: IO :: JDBC
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Starter
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Examples
[INFO] Apache Beam :: SDKs :: Java :: Extensions
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Join library
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Sorter
[INFO] Apache Beam :: SDKs :: Java :: Microbenchmarks
[INFO] Apache Beam :: SDKs :: Java :: Java 8 Tests
[INFO] Apache Beam :: Runners :: Flink
[INFO] Apache Beam :: Runners :: Flink :: Core
[INFO] Apache Beam :: Runners :: Flink :: Examples
[INFO] Apache Beam :: Runners :: Spark
[INFO] Apache Beam :: Runners :: Apex
[INFO] Apache Beam :: Examples
[INFO] Apache Beam :: Examples :: Java
[INFO] Apache Beam :: Examples :: Java 8
[WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
no dependency information available
[WARNING] Failed to retrieve plugin descriptor for 
org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not be 
resolved: Failure to find 

[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668366#comment-15668366
 ] 

ASF GitHub Bot commented on BEAM-498:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1360


> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
>






[GitHub] incubator-beam pull request #1360: [BEAM-498] Rename DoFn.ExtraContextFactor...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1360




[1/2] incubator-beam git commit: This closes #1360

2016-11-15 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 503f26f44 -> 201110222


This closes #1360


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/20111022
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/20111022
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/20111022

Branch: refs/heads/master
Commit: 20111022256e641ad288c28d6953c15314eb6a0d
Parents: 503f26f 469c689
Author: Kenneth Knowles 
Authored: Tue Nov 15 13:11:40 2016 -0800
Committer: Kenneth Knowles 
Committed: Tue Nov 15 13:11:40 2016 -0800

--
 .../beam/runners/core/SimpleDoFnRunner.java |  4 +--
 .../beam/runners/core/SplittableParDo.java  | 16 ++
 .../org/apache/beam/sdk/transforms/DoFn.java|  8 ++---
 .../beam/sdk/transforms/DoFnAdapters.java   | 10 +++
 .../reflect/ByteBuddyDoFnInvokerFactory.java| 15 +-
 .../reflect/ByteBuddyOnTimerInvokerFactory.java |  4 +--
 .../sdk/transforms/reflect/DoFnInvoker.java |  2 +-
 .../sdk/transforms/reflect/DoFnInvokers.java|  3 +-
 .../sdk/transforms/reflect/OnTimerInvoker.java  |  2 +-
 .../transforms/reflect/DoFnInvokersTest.java| 31 ++--
 .../transforms/reflect/OnTimerInvokersTest.java |  7 ++---
 .../transforms/DoFnInvokersBenchmark.java   |  8 ++---
 12 files changed, 56 insertions(+), 54 deletions(-)
--




[2/2] incubator-beam git commit: Rename DoFn.ExtraContextFactory to DoFn.ArgumentProvider

2016-11-15 Thread kenn
Rename DoFn.ExtraContextFactory to DoFn.ArgumentProvider

The prior name started in the right place, but the role has gradually
morphed into a provider for all DoFn method arguments.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/469c689c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/469c689c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/469c689c

Branch: refs/heads/master
Commit: 469c689cc6bd0fe74658bf95b1e206cef3e0711d
Parents: 503f26f
Author: Kenneth Knowles 
Authored: Mon Nov 14 22:19:35 2016 -0800
Committer: Kenneth Knowles 
Committed: Tue Nov 15 13:11:40 2016 -0800

--
 .../beam/runners/core/SimpleDoFnRunner.java |  4 +--
 .../beam/runners/core/SplittableParDo.java  | 16 ++
 .../org/apache/beam/sdk/transforms/DoFn.java|  8 ++---
 .../beam/sdk/transforms/DoFnAdapters.java   | 10 +++
 .../reflect/ByteBuddyDoFnInvokerFactory.java| 15 +-
 .../reflect/ByteBuddyOnTimerInvokerFactory.java |  4 +--
 .../sdk/transforms/reflect/DoFnInvoker.java |  2 +-
 .../sdk/transforms/reflect/DoFnInvokers.java|  3 +-
 .../sdk/transforms/reflect/OnTimerInvoker.java  |  2 +-
 .../transforms/reflect/DoFnInvokersTest.java| 31 ++--
 .../transforms/reflect/OnTimerInvokersTest.java |  7 ++---
 .../transforms/DoFnInvokersBenchmark.java   |  8 ++---
 12 files changed, 56 insertions(+), 54 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/469c689c/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
index 1550303..c046d11 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
@@ -179,7 +179,7 @@ public class SimpleDoFnRunner implements 
DoFnRunner extends DoFn.Context
-  implements DoFn.ExtraContextFactory {
+  implements DoFn.ArgumentProvider {
 private static final int MAX_SIDE_OUTPUTS = 1000;
 
 final PipelineOptions options;
@@ -422,7 +422,7 @@ public class SimpleDoFnRunner implements 
DoFnRunner extends DoFn.ProcessContext
-  implements DoFn.ExtraContextFactory {
+  implements DoFn.ArgumentProvider {
 
 final DoFn fn;
 final DoFnContext context;

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/469c689c/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
index e344f92..3003984 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
@@ -391,19 +391,23 @@ public class SplittableParDo<
   };
 }
 
-/** Creates an {@link DoFn.ExtraContextFactory} that provides just the 
given tracker. */
-private DoFn.ExtraContextFactory wrapTracker(
+/**
+ * Creates an {@link DoFn.ArgumentProvider} that provides the given 
tracker as well as the given
+ * {@link ProcessContext} (which is also provided when a {@link Context} 
is requested).
+ */
+private DoFn.ArgumentProvider wrapTracker(
 TrackerT tracker, DoFn.ProcessContext processContext) 
{
-  return new ExtraContextFactoryForTracker<>(tracker, processContext);
+
+  return new ArgumentProviderForTracker<>(tracker, processContext);
 }
 
-private static class ExtraContextFactoryForTracker<
+private static class ArgumentProviderForTracker<
 InputT, OutputT, TrackerT extends RestrictionTracker>
-implements DoFn.ExtraContextFactory {
+implements DoFn.ArgumentProvider {
   private final TrackerT tracker;
   private final DoFn

[GitHub] incubator-beam pull request #1363: Remove Pipeline#getFullNameForTesting

2016-11-15 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/1363

Remove Pipeline#getFullNameForTesting

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam cleanup_pipeline

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1363.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1363


commit f7f51c0137c130d26520511bdf9eac9dde77248a
Author: Thomas Groh 
Date:   2016-11-15T20:38:02Z

Remove Pipeline#getFullNameForTesting






[jira] [Commented] (BEAM-983) runners/spark/translation/streaming/utils/TestPipelineOptions.java missing Apache license header.

2016-11-15 Thread Amit Sela (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668220#comment-15668220
 ] 

Amit Sela commented on BEAM-983:


Not your fault. Jenkins didn't run in PreCommit - it's kind of lost for now.
It should have run and failed on the missing license, which has personally
happened to me many, many times ;)

> runners/spark/translation/streaming/utils/TestPipelineOptions.java missing 
> Apache license header.
> -
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Amit Sela
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.





[jira] [Commented] (BEAM-983) runners/spark/translation/streaming/utils/TestPipelineOptions.java missing Apache license header.

2016-11-15 Thread Stas Levin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668207#comment-15668207
 ] 

Stas Levin commented on BEAM-983:
-

[~jasonkuster] [~amitsela] Sorry about that. Can these tests be added to the
set of tests that run against the GitHub PR?

> runners/spark/translation/streaming/utils/TestPipelineOptions.java missing 
> Apache license header.
> -
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Amit Sela
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.





[jira] [Closed] (BEAM-915) Update /get-started/releases doc

2016-11-15 Thread Hadar Hod (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hadar Hod closed BEAM-915.
--
   Resolution: Fixed
Fix Version/s: Not applicable

Page renamed to get-started/downloads

> Update /get-started/releases doc
> 
>
> Key: BEAM-915
> URL: https://issues.apache.org/jira/browse/BEAM-915
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Hadar Hod
>Assignee: Hadar Hod
> Fix For: Not applicable
>
>






[jira] [Assigned] (BEAM-906) Fill in the get-started/downloads portion of the website

2016-11-15 Thread Hadar Hod (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hadar Hod reassigned BEAM-906:
--

Assignee: Hadar Hod

> Fill in the get-started/downloads portion of the website
> 
>
> Key: BEAM-906
> URL: https://issues.apache.org/jira/browse/BEAM-906
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Hadar Hod
>Assignee: Hadar Hod
>






[2/3] incubator-beam git commit: Fix merge lint error

2016-11-15 Thread robertwb
Fix merge lint error


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8bf25269
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8bf25269
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8bf25269

Branch: refs/heads/python-sdk
Commit: 8bf2526965dd319f654e7f995df940307fb2260f
Parents: 9d805ee
Author: Robert Bradshaw 
Authored: Tue Nov 15 11:06:27 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 11:06:27 2016 -0800

--
 sdks/python/apache_beam/io/textio.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/8bf25269/sdks/python/apache_beam/io/textio.py
--
diff --git a/sdks/python/apache_beam/io/textio.py 
b/sdks/python/apache_beam/io/textio.py
index 4e94f87..9c89b68 100644
--- a/sdks/python/apache_beam/io/textio.py
+++ b/sdks/python/apache_beam/io/textio.py
@@ -244,7 +244,8 @@ class ReadFromText(PTransform):
 self._strip_trailing_newlines = strip_trailing_newlines
 self._coder = coder
 self._source = _TextSource(file_pattern, min_bundle_size, compression_type,
-   strip_trailing_newlines, coder, validate=validate)
+   strip_trailing_newlines, coder,
+   validate=validate)
 
   def apply(self, pvalue):
 return pvalue.pipeline | Read(self._source)



Jenkins build is back to normal : beam_PostCommit_PythonVerify #713

2016-11-15 Thread Apache Jenkins Server
See 



[1/3] incubator-beam git commit: Display Data for: PipelineOptions, combiners, more sources

2016-11-15 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 384fb5dc1 -> d1fccbf5e


Display Data for: PipelineOptions, combiners, more sources


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9d805eec
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9d805eec
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9d805eec

Branch: refs/heads/python-sdk
Commit: 9d805eec6b9cedb43b6e79e255483fc8fa6832d1
Parents: 384fb5d
Author: Pablo 
Authored: Wed Nov 9 14:03:03 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 11:02:28 2016 -0800

--
 sdks/python/apache_beam/internal/apiclient.py   | 18 +++--
 .../apache_beam/internal/json_value_test.py |  8 +-
 sdks/python/apache_beam/io/avroio.py| 20 -
 sdks/python/apache_beam/io/avroio_test.py   | 78 
 sdks/python/apache_beam/io/fileio.py| 20 -
 sdks/python/apache_beam/io/fileio_test.py   | 40 --
 sdks/python/apache_beam/io/iobase.py|  5 +-
 sdks/python/apache_beam/io/textio.py| 25 +--
 sdks/python/apache_beam/io/textio_test.py   | 28 +++
 sdks/python/apache_beam/pipeline_test.py| 12 +--
 sdks/python/apache_beam/transforms/combiners.py | 14 
 .../apache_beam/transforms/combiners_test.py| 63 
 sdks/python/apache_beam/transforms/core.py  | 16 +++-
 .../python/apache_beam/transforms/ptransform.py |  9 +++
 sdks/python/apache_beam/utils/options.py| 17 -
 .../apache_beam/utils/pipeline_options_test.py  | 46 ++--
 sdks/python/setup.py|  3 +
 17 files changed, 375 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9d805eec/sdks/python/apache_beam/internal/apiclient.py
--
diff --git a/sdks/python/apache_beam/internal/apiclient.py 
b/sdks/python/apache_beam/internal/apiclient.py
index 5ac9d6e..8992ec3 100644
--- a/sdks/python/apache_beam/internal/apiclient.py
+++ b/sdks/python/apache_beam/internal/apiclient.py
@@ -32,6 +32,7 @@ from apache_beam import utils
 from apache_beam.internal.auth import get_service_credentials
 from apache_beam.internal.json_value import to_json_value
 from apache_beam.transforms import cy_combiners
+from apache_beam.transforms.display import DisplayData
 from apache_beam.utils import dependency
 from apache_beam.utils import retry
 from apache_beam.utils.dependency import get_required_container_version
@@ -234,11 +235,18 @@ class Environment(object):
   self.proto.sdkPipelineOptions = (
   dataflow.Environment.SdkPipelineOptionsValue())
 
-  for k, v in sdk_pipeline_options.iteritems():
-if v is not None:
-  self.proto.sdkPipelineOptions.additionalProperties.append(
-  dataflow.Environment.SdkPipelineOptionsValue.AdditionalProperty(
-  key=k, value=to_json_value(v)))
+  options_dict = {k: v
+  for k, v in sdk_pipeline_options.iteritems()
+  if v is not None}
+  self.proto.sdkPipelineOptions.additionalProperties.append(
+  dataflow.Environment.SdkPipelineOptionsValue.AdditionalProperty(
+  key='options', value=to_json_value(options_dict)))
+
+  dd = DisplayData.create_from(options)
+  items = [item.get_dict() for item in dd.items]
+  self.proto.sdkPipelineOptions.additionalProperties.append(
+  dataflow.Environment.SdkPipelineOptionsValue.AdditionalProperty(
+  key='display_data', value=to_json_value(items)))
 
 
 class Job(object):
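The hunk above replaces a per-key append loop with a dict comprehension that drops unset (None) options before JSON-encoding them under a single 'options' key. A minimal standalone sketch of just that filtering step (Python 3 syntax, so items() instead of iteritems(); the option names are made up for illustration):

```python
def drop_unset(options):
    # Keep only options that were actually set, mirroring the
    # comprehension introduced in the patch above.
    return {k: v for k, v in options.items() if v is not None}

opts = {'project': 'my-project', 'zone': None, 'num_workers': 3}
print(drop_unset(opts))  # {'project': 'my-project', 'num_workers': 3}
```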

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9d805eec/sdks/python/apache_beam/internal/json_value_test.py
--
diff --git a/sdks/python/apache_beam/internal/json_value_test.py 
b/sdks/python/apache_beam/internal/json_value_test.py
index cfab293..a4a47b8 100644
--- a/sdks/python/apache_beam/internal/json_value_test.py
+++ b/sdks/python/apache_beam/internal/json_value_test.py
@@ -76,14 +76,8 @@ class JsonValueTest(unittest.TestCase):
self.assertEquals(long(27), from_json_value(to_json_value(long(27))))
 
   def test_too_long_value(self):
-try:
+with self.assertRaises(TypeError):
   to_json_value(long(1 << 64))
-except TypeError as e:
-  pass
-except Exception as e:
-  self.fail('Unexpected exception raised: {}'.format(e))
-else:
-  self.fail('TypeError not raised.')
 
 
 if __name__ == '__main__':
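The refactor above swaps a manual try/except/else dance for unittest's assertRaises context manager, which fails the test automatically when the expected exception is not raised. A self-contained sketch of the pattern (the encode function is a stand-in, not Beam's actual to_json_value):

```python
import unittest

def encode(value):
    # Stand-in for the real to_json_value: reject values over 64 bits.
    if value >= 1 << 64:
        raise TypeError('value too large to encode')
    return value

class EncodeTest(unittest.TestCase):
    def test_too_long_value(self):
        # Fails automatically if TypeError is *not* raised inside the block.
        with self.assertRaises(TypeError):
            encode(1 << 64)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(EncodeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert result.wasSuccessful()
```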

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9d805eec/sdks/python/apache_beam/io/avroio.py

[3/3] incubator-beam git commit: Closes #1264

2016-11-15 Thread robertwb
Closes #1264


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d1fccbf5
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d1fccbf5
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d1fccbf5

Branch: refs/heads/python-sdk
Commit: d1fccbf5eb5064a2a1a6831bb523f8cdf705c8d8
Parents: 384fb5d 8bf2526
Author: Robert Bradshaw 
Authored: Tue Nov 15 11:06:50 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 11:06:50 2016 -0800

--
 sdks/python/apache_beam/internal/apiclient.py   | 18 +++--
 .../apache_beam/internal/json_value_test.py |  8 +-
 sdks/python/apache_beam/io/avroio.py| 20 -
 sdks/python/apache_beam/io/avroio_test.py   | 78 
 sdks/python/apache_beam/io/fileio.py| 20 -
 sdks/python/apache_beam/io/fileio_test.py   | 40 --
 sdks/python/apache_beam/io/iobase.py|  5 +-
 sdks/python/apache_beam/io/textio.py| 26 +--
 sdks/python/apache_beam/io/textio_test.py   | 28 +++
 sdks/python/apache_beam/pipeline_test.py| 12 +--
 sdks/python/apache_beam/transforms/combiners.py | 14 
 .../apache_beam/transforms/combiners_test.py| 63 
 sdks/python/apache_beam/transforms/core.py  | 16 +++-
 .../python/apache_beam/transforms/ptransform.py |  9 +++
 sdks/python/apache_beam/utils/options.py| 17 -
 .../apache_beam/utils/pipeline_options_test.py  | 46 ++--
 sdks/python/setup.py|  3 +
 17 files changed, 376 insertions(+), 47 deletions(-)
--




Build failed in Jenkins: beam_PostCommit_PythonVerify #712

2016-11-15 Thread Apache Jenkins Server
See 

Changes:

[robertwb] Add a couple of missing coder tests.

[robertwb] Also check coder determinism.

--
[...truncated 733 lines...]
test_default_typed_hint 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_default_untyped_hint 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_deferred_side_input_iterable 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_deferred_side_inputs 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_keyword_side_input_hint 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_any_compatibility 
(apache_beam.typehints.typehints_test.AnyTypeConstraintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.AnyTypeConstraintTestCase) ... 
ok
test_type_check 
(apache_beam.typehints.typehints_test.AnyTypeConstraintTestCase) ... ok
test_composite_takes_and_returns_hints 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_enable_and_disable_type_checking_returns 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_enable_and_disable_type_checking_takes 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_simple_takes_and_returns_hints 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_valid_mix_pos_and_keyword_with_both_orders 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_getcallargs_forhints 
(apache_beam.typehints.typehints_test.DecoratorHelpers) ... ok
test_hint_helper (apache_beam.typehints.typehints_test.DecoratorHelpers) ... ok
test_positional_arg_hints 
(apache_beam.typehints.typehints_test.DecoratorHelpers) ... ok
test_compatibility (apache_beam.typehints.typehints_test.DictHintTestCase) ... 
ok
test_getitem_param_must_be_tuple 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... 
:497:
 DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  e.exception.message)
ok
test_getitem_param_must_have_length_2 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_key_type_must_be_valid_composite_param 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_match_type_variables 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_invalid_key_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_invalid_value_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_valid_composite_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_valid_simple_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_checks_not_dict 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_value_type_must_be_valid_composite_param 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.GeneratorHintTestCase) 
... ok
test_generator_argument_hint_invalid_yield_type 
(apache_beam.typehints.typehints_test.GeneratorHintTestCase) ... ok
test_generator_return_hint_invalid_yield_type 
(apache_beam.typehints.typehints_test.GeneratorHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.GeneratorHintTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.IterableHintTestCase) 
... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_tuple_compatibility 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_must_be_iterable 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_enforce_kv_type_constraint 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_be_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_have_length_2 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_proxy_to_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_composite_type 

[jira] [Commented] (BEAM-983) runners/spark/translation/streaming/utils/TestPipelineOptions.java missing Apache license header.

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667939#comment-15667939
 ] 

ASF GitHub Bot commented on BEAM-983:
-

GitHub user amitsela opened a pull request:

https://github.com/apache/incubator-beam/pull/1362

[BEAM-983] add missing license.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amitsela/incubator-beam BEAM-983

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1362.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1362


commit 19d180594a7cce258fc78632521a6536bde85935
Author: Sela 
Date:   2016-11-15T18:53:38Z

[BEAM-983] add missing license.




> runners/spark/translation/streaming/utils/TestPipelineOptions.java missing 
> Apache license header.
> -
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Amit Sela
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.





[GitHub] incubator-beam pull request #1362: [BEAM-983] add missing license.

2016-11-15 Thread amitsela
GitHub user amitsela opened a pull request:

https://github.com/apache/incubator-beam/pull/1362

[BEAM-983] add missing license.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amitsela/incubator-beam BEAM-983

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1362.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1362


commit 19d180594a7cce258fc78632521a6536bde85935
Author: Sela 
Date:   2016-11-15T18:53:38Z

[BEAM-983] add missing license.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/3] incubator-beam git commit: Add a couple of missing coder tests.

2016-11-15 Thread robertwb
Add a couple of missing coder tests.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ab02a1d6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ab02a1d6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ab02a1d6

Branch: refs/heads/python-sdk
Commit: ab02a1d60ce70f31c087a78a940bb4833b477ebb
Parents: c4208a8
Author: Robert Bradshaw 
Authored: Wed Nov 9 13:47:53 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 10:48:35 2016 -0800

--
 .../apache_beam/coders/coders_test_common.py| 24 
 1 file changed, 19 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ab02a1d6/sdks/python/apache_beam/coders/coders_test_common.py
--
diff --git a/sdks/python/apache_beam/coders/coders_test_common.py 
b/sdks/python/apache_beam/coders/coders_test_common.py
index adeb6a5..2ec8e7f 100644
--- a/sdks/python/apache_beam/coders/coders_test_common.py
+++ b/sdks/python/apache_beam/coders/coders_test_common.py
@@ -59,13 +59,9 @@ class CodersTest(unittest.TestCase):
'Base' not in c.__name__)
 standard -= set([coders.Coder,
  coders.FastCoder,
- coders.Base64PickleCoder,
- coders.FloatCoder,
  coders.ProtoCoder,
- coders.TimestampCoder,
  coders.ToStringCoder,
- coders.WindowCoder,
- coders.WindowedValueCoder])
+ coders.WindowCoder])
 assert not standard - cls.seen, standard - cls.seen
 assert not standard - cls.seen_nested, standard - cls.seen_nested
 
@@ -155,6 +151,9 @@ class CodersTest(unittest.TestCase):
 self.check_coder(coders.FloatCoder(),
  *[float(2 ** (0.1 * x)) for x in range(-100, 100)])
 self.check_coder(coders.FloatCoder(), float('-Inf'), float('Inf'))
+self.check_coder(
+coders.TupleCoder((coders.FloatCoder(), coders.FloatCoder())),
+(0, 1), (-100, 100), (0.5, 0.25))
 
   def test_singleton_coder(self):
 a = 'anything'
@@ -173,6 +172,9 @@ class CodersTest(unittest.TestCase):
 self.check_coder(coders.TimestampCoder(),
  timestamp.Timestamp(micros=-1234567890123456789),
  timestamp.Timestamp(micros=1234567890123456789))
+self.check_coder(
+coders.TupleCoder((coders.TimestampCoder(), coders.BytesCoder())),
+(timestamp.Timestamp.of(27), 'abc'))
 
   def test_tuple_coder(self):
 self.check_coder(
@@ -209,6 +211,18 @@ class CodersTest(unittest.TestCase):
coders.IterableCoder(coders.VarIntCoder()))),
 (1, [1, 2, 3]))
 
+  def test_windowed_value_coder(self):
+self.check_coder(
+coders.WindowedValueCoder(coders.VarIntCoder()),
+windowed_value.WindowedValue(3, -100, ()),
+windowed_value.WindowedValue(-1, 100, (1, 2, 3)))
+self.check_coder(
+coders.TupleCoder((
+coders.WindowedValueCoder(coders.FloatCoder()),
+coders.WindowedValueCoder(coders.StrUtf8Coder()))),
+(windowed_value.WindowedValue(1.5, 0, ()),
+ windowed_value.WindowedValue("abc", 10, ('window',))))
+
   def test_proto_coder(self):
 # For instructions on how these test proto message were generated,
 # see coders_test.py
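The new tests above all exercise the same round-trip property: encoding a sample value and decoding it back must return the original. A minimal sketch of that check, using pickle as a stand-in coder (the real tests run against Beam's coders module and also cover nested/tuple composition):

```python
import pickle

def check_coder(encode, decode, *values):
    # Round-trip property: decode(encode(v)) == v for every sample value.
    for v in values:
        assert decode(encode(v)) == v, 'round-trip failed for %r' % (v,)

# pickle stands in for a Beam coder's encode/decode pair.
check_coder(pickle.dumps, pickle.loads,
            3, -100, 1.5, 'abc', (0, 1), [1, 2, 3], ('window', [1.5, 'abc']))
```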



[GitHub] incubator-beam pull request #1326: Additional Coder tests

2016-11-15 Thread robertwb
Github user robertwb closed the pull request at:

https://github.com/apache/incubator-beam/pull/1326




[jira] [Closed] (BEAM-984) rat plugin fails build because of a missing license.

2016-11-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela closed BEAM-984.
--
Resolution: Duplicate

> rat plugin fails build because of a missing license.
> 
>
> Key: BEAM-984
> URL: https://issues.apache.org/jira/browse/BEAM-984
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.3.0-incubating
>Reporter: Amit Sela
>Assignee: Amit Sela
> Fix For: 0.4.0-incubating
>
>
> Somehow missed by PreCommit.
> The file is: 
> https://github.com/apache/incubator-beam/blob/master/runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java





[jira] [Commented] (BEAM-983) runners/spark/translation/streaming/utils/TestPipelineOptions.java missing Apache license header.

2016-11-15 Thread Amit Sela (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667929#comment-15667929
 ] 

Amit Sela commented on BEAM-983:


I'll fix this now, thanks [~jasonkuster]

> runners/spark/translation/streaming/utils/TestPipelineOptions.java missing 
> Apache license header.
> -
>
> Key: BEAM-983
> URL: https://issues.apache.org/jira/browse/BEAM-983
> Project: Beam
>  Issue Type: Bug
>Reporter: Jason Kuster
>Assignee: Amit Sela
>
> Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
> failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
> points to 
> runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
>  as the culprit.





[jira] [Created] (BEAM-984) rat plugin fails build because of a missing license.

2016-11-15 Thread Amit Sela (JIRA)
Amit Sela created BEAM-984:
--

 Summary: rat plugin fails build because of a missing license.
 Key: BEAM-984
 URL: https://issues.apache.org/jira/browse/BEAM-984
 Project: Beam
  Issue Type: Bug
  Components: runner-spark
Affects Versions: 0.3.0-incubating
Reporter: Amit Sela
Assignee: Amit Sela
 Fix For: 0.4.0-incubating


Somehow missed by PreCommit.
The file is: 
https://github.com/apache/incubator-beam/blob/master/runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java






[jira] [Created] (BEAM-983) runners/spark/translation/streaming/utils/TestPipelineOptions.java missing Apache license header.

2016-11-15 Thread Jason Kuster (JIRA)
Jason Kuster created BEAM-983:
-

 Summary: 
runners/spark/translation/streaming/utils/TestPipelineOptions.java missing 
Apache license header.
 Key: BEAM-983
 URL: https://issues.apache.org/jira/browse/BEAM-983
 Project: Beam
  Issue Type: Bug
Reporter: Jason Kuster
Assignee: Stas Levin


Pull request #1332 (https://github.com/apache/incubator-beam/pull/1332) is 
failing in beam_PostCommit_MavenVerify with an Apache Rat failure -- 
https://builds.apache.org/job/beam_PostCommit_MavenVerify/1827/ -- Rat output 
points to 
runners/spark/src/test/java/org/apache/beam/runners/spark/translation/streaming/utils/TestPipelineOptions.java
 as the culprit.





Re: Build failed in Jenkins: beam_PostCommit_MavenVerify #1827

2016-11-15 Thread Amit Sela
A new file is missing a license. Weird... PreCommit passed. I'll open a
ticket and fix this.

On Tue, Nov 15, 2016 at 8:32 PM Jason Kuster 
wrote:

> Investigating...
>
> On Tue, Nov 15, 2016 at 10:29 AM, Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
> > See  > MavenVerify/1827/changes>
> >
> > Changes:
> >
> > [ansela] [BEAM-891] fix build occasionally fails on
> > IndexOutOfBoundsException.
> >
> > --
> > [...truncated 5 lines...]
> > Building remotely on beam3 (beam) in workspace <
> https://builds.apache.org/
> > job/beam_PostCommit_MavenVerify/ws/>
> >  > git rev-parse --is-inside-work-tree # timeout=10
> > Fetching changes from the remote Git repository
> >  > git config remote.origin.url https://github.com/apache/
> > incubator-beam.git # timeout=10
> > Fetching upstream changes from https://github.com/apache/
> > incubator-beam.git
> >  > git --version # timeout=10
> >  > git -c core.askpass=true fetch --tags --progress
> > https://github.com/apache/incubator-beam.git +refs/heads/*:refs/remotes/
> > origin/*
> >  > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
> >  > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
> > Checking out Revision 503f26f448ea9f46fcfcdd46e60cba80e4844e28
> > (refs/remotes/origin/master)
> >  > git config core.sparsecheckout # timeout=10
> >  > git checkout -f 503f26f448ea9f46fcfcdd46e60cba80e4844e28
> >  > git rev-list 47646d641848cae9d7adefa6dd537a10cade7b1b # timeout=10
> > [EnvInject] - Executing scripts and injecting environment variables after
> > the SCM step.
> > [EnvInject] - Injecting as environment variables the properties content
> > SPARK_LOCAL_IP=127.0.0.1
> >
> > [EnvInject] - Variables injected successfully.
> > Parsing POMs
> > Modules changed, recalculating dependency graph
> > Established TCP socket on 35235
> > maven32-agent.jar already up to date
> > maven32-interceptor.jar already up to date
> > maven3-interceptor-commons.jar already up to date
> > [beam_PostCommit_MavenVerify] $
> /home/jenkins/tools/java/latest1.8/bin/java
> > -Xmx2g -Xms256m -XX:MaxPermSize=512m -cp /home/jenkins/jenkins-slave/
> > maven32-agent.jar:/home/jenkins/tools/maven/apache-
> > maven-3.3.3/boot/plexus-classworlds-2.5.2.jar:/home/
> > jenkins/tools/maven/apache-maven-3.3.3/conf/logging
> jenkins.maven3.agent.Maven32Main
> > /home/jenkins/tools/maven/apache-maven-3.3.3
> /home/jenkins/jenkins-slave/slave.jar
> > /home/jenkins/jenkins-slave/maven32-interceptor.jar
> > /home/jenkins/jenkins-slave/maven3-interceptor-commons.jar 35235
> > Java HotSpot(TM) 64-Bit Server VM warning: ignoring option
> > MaxPermSize=512m; support was removed in 8.0
> > <===[JENKINS REMOTING CAPACITY]===>   channel started
> > Executing Maven:  -B -f  > MavenVerify/ws/pom.xml> -Dmaven.repo.local= > builds.apache.org/job/beam_PostCommit_MavenVerify/ws/.repository> -B -e
> > -P release,dataflow-runner -DrepoToken= clean install
> coveralls:report
> > -DskipITs=false -DintegrationTestPipelineOptions=[
> "--project=apache-beam-testing",
> > "--tempRoot=gs://temp-storage-for-end-to-end-tests",
> > "--runner=org.apache.beam.runners.dataflow.testing.TestDataflowRunner" ]
> > [INFO] Error stacktraces are turned on.
> > [INFO] Scanning for projects...
> > [INFO] 
> > 
> > [INFO] Reactor Build Order:
> > [INFO]
> > [INFO] Apache Beam :: Parent
> > [INFO] Apache Beam :: SDKs :: Java :: Build Tools
> > [INFO] Apache Beam :: SDKs
> > [INFO] Apache Beam :: SDKs :: Java
> > [INFO] Apache Beam :: SDKs :: Java :: Core
> > [INFO] Apache Beam :: Runners
> > [INFO] Apache Beam :: Runners :: Core Java
> > [INFO] Apache Beam :: Runners :: Direct Java
> > [INFO] Apache Beam :: Runners :: Google Cloud Dataflow
> > [INFO] Apache Beam :: SDKs :: Java :: IO
> > [INFO] Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform
> > [INFO] Apache Beam :: SDKs :: Java :: IO :: HDFS
> > [INFO] Apache Beam :: SDKs :: Java :: IO :: JMS
> > [INFO] Apache Beam :: SDKs :: Java :: IO :: Kafka
> > [INFO] Apache Beam :: SDKs :: Java :: IO :: Kinesis
> > [INFO] Apache Beam :: SDKs :: Java :: IO :: MongoDB
> > [INFO] Apache Beam :: SDKs :: Java :: IO :: JDBC
> > [INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes
> > [INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Starter
> > [INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Examples
> > [INFO] Apache Beam :: SDKs :: Java :: Extensions
> > [INFO] Apache Beam :: SDKs :: Java :: Extensions :: Join library
> > [INFO] Apache Beam :: SDKs :: Java :: Extensions :: Sorter
> > [INFO] Apache Beam :: SDKs :: Java :: Microbenchmarks
> > [INFO] Apache Beam :: SDKs :: Java :: Java 8 Tests
> > [INFO] Apache Beam :: Runners :: Flink
> > [INFO] Apache Beam :: Runners :: Flink :: 

Build failed in Jenkins: beam_PostCommit_MavenVerify #1827

2016-11-15 Thread Apache Jenkins Server
See 

Changes:

[ansela] [BEAM-891] fix build occasionally fails on IndexOutOfBoundsException.

--
[...truncated 5 lines...]
Building remotely on beam3 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/incubator-beam.git # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/incubator-beam.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://github.com/apache/incubator-beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision 503f26f448ea9f46fcfcdd46e60cba80e4844e28 
(refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 503f26f448ea9f46fcfcdd46e60cba80e4844e28
 > git rev-list 47646d641848cae9d7adefa6dd537a10cade7b1b # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
Parsing POMs
Modules changed, recalculating dependency graph
Established TCP socket on 35235
maven32-agent.jar already up to date
maven32-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
[beam_PostCommit_MavenVerify] $ /home/jenkins/tools/java/latest1.8/bin/java 
-Xmx2g -Xms256m -XX:MaxPermSize=512m -cp 
/home/jenkins/jenkins-slave/maven32-agent.jar:/home/jenkins/tools/maven/apache-maven-3.3.3/boot/plexus-classworlds-2.5.2.jar:/home/jenkins/tools/maven/apache-maven-3.3.3/conf/logging
 jenkins.maven3.agent.Maven32Main /home/jenkins/tools/maven/apache-maven-3.3.3 
/home/jenkins/jenkins-slave/slave.jar 
/home/jenkins/jenkins-slave/maven32-interceptor.jar 
/home/jenkins/jenkins-slave/maven3-interceptor-commons.jar 35235
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f 
 
-Dmaven.repo.local=
 -B -e -P release,dataflow-runner -DrepoToken= clean install 
coveralls:report -DskipITs=false -DintegrationTestPipelineOptions=[ 
"--project=apache-beam-testing", 
"--tempRoot=gs://temp-storage-for-end-to-end-tests", 
"--runner=org.apache.beam.runners.dataflow.testing.TestDataflowRunner" ]
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Apache Beam :: Parent
[INFO] Apache Beam :: SDKs :: Java :: Build Tools
[INFO] Apache Beam :: SDKs
[INFO] Apache Beam :: SDKs :: Java
[INFO] Apache Beam :: SDKs :: Java :: Core
[INFO] Apache Beam :: Runners
[INFO] Apache Beam :: Runners :: Core Java
[INFO] Apache Beam :: Runners :: Direct Java
[INFO] Apache Beam :: Runners :: Google Cloud Dataflow
[INFO] Apache Beam :: SDKs :: Java :: IO
[INFO] Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform
[INFO] Apache Beam :: SDKs :: Java :: IO :: HDFS
[INFO] Apache Beam :: SDKs :: Java :: IO :: JMS
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kafka
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kinesis
[INFO] Apache Beam :: SDKs :: Java :: IO :: MongoDB
[INFO] Apache Beam :: SDKs :: Java :: IO :: JDBC
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Starter
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Examples
[INFO] Apache Beam :: SDKs :: Java :: Extensions
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Join library
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Sorter
[INFO] Apache Beam :: SDKs :: Java :: Microbenchmarks
[INFO] Apache Beam :: SDKs :: Java :: Java 8 Tests
[INFO] Apache Beam :: Runners :: Flink
[INFO] Apache Beam :: Runners :: Flink :: Core
[INFO] Apache Beam :: Runners :: Flink :: Examples
[INFO] Apache Beam :: Runners :: Spark
[INFO] Apache Beam :: Runners :: Apex
[INFO] Apache Beam :: Examples
[INFO] Apache Beam :: Examples :: Java
[INFO] Apache Beam :: Examples :: Java 8
[WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
no dependency information available
[WARNING] Failed to retrieve plugin descriptor for 
org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not be 
resolved: Failure to find org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 in 
https://repo.maven.apache.org/maven2 was 

[1/2] incubator-beam git commit: [BEAM-891] fix build occasionally fails on IndexOutOfBoundsException.

2016-11-15 Thread amitsela
Repository: incubator-beam
Updated Branches:
  refs/heads/master 47646d641 -> 503f26f44


[BEAM-891] fix build occasionally fails on IndexOutOfBoundsException.

Moved "TestPipelineOptions#withTmpCheckpointDir" to 
TestPipelineOptionsForStreaming.
Removed an unused member in ProvidedSparkContextTest.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/0331dd1c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/0331dd1c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/0331dd1c

Branch: refs/heads/master
Commit: 0331dd1cd75e60484f0b15723e4e7edc280a4d12
Parents: 47646d6
Author: Stas Levin 
Authored: Thu Nov 10 13:32:51 2016 +0200
Committer: Sela 
Committed: Tue Nov 15 19:52:12 2016 +0200

--
 runners/spark/pom.xml   |  3 +-
 .../runners/spark/SparkPipelineOptions.java |  4 +-
 .../spark/translation/SparkRuntimeContext.java  |  4 +-
 .../runners/spark/ProvidedSparkContextTest.java | 26 -
 .../metrics/sink/NamedAggregatorsTest.java  | 13 +++--
 .../beam/runners/spark/io/AvroPipelineTest.java | 12 ++---
 .../beam/runners/spark/io/NumShardsTest.java| 10 ++--
 .../io/hadoop/HadoopFileFormatPipelineTest.java | 12 ++---
 .../spark/translation/SideEffectsTest.java  | 11 ++--
 .../streaming/EmptyStreamAssertionTest.java |  4 +-
 .../streaming/FlattenStreamingTest.java |  4 +-
 .../streaming/KafkaStreamingTest.java   |  4 +-
 .../ResumeFromCheckpointStreamingTest.java  |  4 +-
 .../streaming/SimpleStreamingWordCountTest.java |  6 +--
 .../utils/TestOptionsForStreaming.java  | 55 
 .../streaming/utils/TestPipelineOptions.java| 25 +
 .../utils/TestPipelineOptionsForStreaming.java  | 44 
 17 files changed, 132 insertions(+), 109 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/0331dd1c/runners/spark/pom.xml
--
diff --git a/runners/spark/pom.xml b/runners/spark/pom.xml
index 1e4a720..4c5b3f5 100644
--- a/runners/spark/pom.xml
+++ b/runners/spark/pom.xml
@@ -82,7 +82,8 @@
 
   [
 "--runner=TestSparkRunner",
-"--streaming=false"
+"--streaming=false",
+"--enableSparkMetricSinks=false"
   ]
 
 
true

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/0331dd1c/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkPipelineOptions.java
--
diff --git 
a/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkPipelineOptions.java
 
b/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkPipelineOptions.java
index 5168c6c..b1ebde9 100644
--- 
a/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkPipelineOptions.java
+++ 
b/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkPipelineOptions.java
@@ -88,8 +88,8 @@ public interface SparkPipelineOptions
 
   @Description("Enable/disable sending aggregator values to Spark's metric 
sinks")
   @Default.Boolean(true)
-  Boolean getEnableSparkSinks();
-  void setEnableSparkSinks(Boolean enableSparkSinks);
+  Boolean getEnableSparkMetricSinks();
+  void setEnableSparkMetricSinks(Boolean enableSparkMetricSinks);
 
   @Description("If the spark runner will be initialized with a provided Spark 
Context. "
   + "The Spark Context should be provided with SparkContextOptions.")

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/0331dd1c/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkRuntimeContext.java
--
diff --git 
a/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkRuntimeContext.java
 
b/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkRuntimeContext.java
index 181a111..564db39 100644
--- 
a/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkRuntimeContext.java
+++ 
b/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkRuntimeContext.java
@@ -86,11 +86,11 @@ public class SparkRuntimeContext implements Serializable {
    final Accumulator<NamedAggregators> accum = 
AccumulatorSingleton.getInstance(jsc);
 final NamedAggregators initialValue = accum.value();
 
-if (opts.getEnableSparkSinks()) {
+if (opts.getEnableSparkMetricSinks()) {
   final MetricsSystem metricsSystem = 
SparkEnv$.MODULE$.get().metricsSystem();
   final AggregatorMetricSource aggregatorMetricSource =
 
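The rename above only changes the accessor pair; the behavior it gates is unchanged. A minimal, self-contained sketch of that gating pattern, in plain Java with hypothetical stand-in classes (the real SparkPipelineOptions is a proxy-backed Beam PipelineOptions interface, and MetricsSystem is Spark's):

```java
import java.util.ArrayList;
import java.util.List;

public class MetricSinkGate {
  // Stand-in for SparkPipelineOptions with the renamed accessor pair.
  static class Options {
    private boolean enableSparkMetricSinks = true; // mirrors @Default.Boolean(true)
    boolean getEnableSparkMetricSinks() { return enableSparkMetricSinks; }
    void setEnableSparkMetricSinks(boolean b) { enableSparkMetricSinks = b; }
  }

  // Stand-in for Spark's MetricsSystem.
  static class MetricsSystem {
    final List<String> sources = new ArrayList<>();
    void registerSource(String name) { sources.add(name); }
  }

  // Registers the aggregator metric source only when the option is enabled,
  // as SparkRuntimeContext does after this change.
  static MetricsSystem wire(Options opts) {
    MetricsSystem metricsSystem = new MetricsSystem();
    if (opts.getEnableSparkMetricSinks()) {
      metricsSystem.registerSource("Beam.Aggregators");
    }
    return metricsSystem;
  }

  public static void main(String[] args) {
    Options opts = new Options();
    // As in the test pom above: --enableSparkMetricSinks=false
    opts.setEnableSparkMetricSinks(false);
    System.out.println(wire(opts).sources.size()); // prints 0
  }
}
```

Disabling the flag in the surefire configuration keeps tests from touching Spark's metrics registry at all, which is what makes the IndexOutOfBoundsException flake unreachable in the test matrix.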

[jira] [Resolved] (BEAM-891) Flake in Spark metrics library?

2016-11-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela resolved BEAM-891.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Flake in Spark metrics library?
> ---
>
> Key: BEAM-891
> URL: https://issues.apache.org/jira/browse/BEAM-891
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Daniel Halperin
>Assignee: Stas Levin
> Fix For: 0.4.0-incubating
>
>
> [~staslev] I think you implemented this functionality originally? Want to 
> take a look? CC [~amitsela]
> Run: 
> https://builds.apache.org/job/beam_PostCommit_RunnableOnService_SparkLocal/org.apache.beam$beam-runners-spark/43/testReport/junit/org.apache.beam.sdk.transforms/FilterTest/testFilterGreaterThan/
> Error:
> {code}
> java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 5
>   at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
>   at 
> org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:77)
>   at 
> org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:53)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:182)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:112)
>   at 
> org.apache.beam.sdk.transforms.FilterTest.testFilterGreaterThan(FilterTest.java:122)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IndexOutOfBoundsException: 5
>   at 
> scala.collection.mutable.ResizableArray$class.apply(ResizableArray.scala:43)
>   at scala.collection.mutable.ArrayBuffer.apply(ArrayBuffer.scala:47)
>   at 
> scala.collection.IndexedSeqOptimized$class.segmentLength(IndexedSeqOptimized.scala:189)
>   at 
> scala.collection.mutable.ArrayBuffer.segmentLength(ArrayBuffer.scala:47)
>   at 
> scala.collection.IndexedSeqOptimized$class.indexWhere(IndexedSeqOptimized.scala:198)
>   at scala.collection.mutable.ArrayBuffer.indexWhere(ArrayBuffer.scala:47)
>   at scala.collection.GenSeqLike$class.indexOf(GenSeqLike.scala:144)
>   at scala.collection.AbstractSeq.indexOf(Seq.scala:40)
>   at scala.collection.GenSeqLike$class.indexOf(GenSeqLike.scala:128)
>   at scala.collection.AbstractSeq.indexOf(Seq.scala:40)
>   at 
> scala.collection.mutable.BufferLike$class.$minus$eq(BufferLike.scala:126)
>   at scala.collection.mutable.AbstractBuffer.$minus$eq(Buffer.scala:48)
>   at 
> org.apache.spark.metrics.MetricsSystem.removeSource(MetricsSystem.scala:159)
>   at 
> org.apache.beam.runners.spark.translation.SparkRuntimeContext.registerMetrics(SparkRuntimeContext.java:94)
>   at 
> org.apache.beam.runners.spark.translation.SparkRuntimeContext.<init>(SparkRuntimeContext.java:66)
>   at 
> org.apache.beam.runners.spark.translation.EvaluationContext.<init>(EvaluationContext.java:73)
>   at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:146)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-891) Flake in Spark metrics library?

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667833#comment-15667833
 ] 

ASF GitHub Bot commented on BEAM-891:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1332


> Flake in Spark metrics library?
> ---
>
> Key: BEAM-891
> URL: https://issues.apache.org/jira/browse/BEAM-891
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Daniel Halperin
>Assignee: Stas Levin
>
> [~staslev] I think you implemented this functionality originally? Want to 
> take a look? CC [~amitsela]
> Run: 
> https://builds.apache.org/job/beam_PostCommit_RunnableOnService_SparkLocal/org.apache.beam$beam-runners-spark/43/testReport/junit/org.apache.beam.sdk.transforms/FilterTest/testFilterGreaterThan/
> Error:
> {code}
> java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: 5
>   at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
>   at 
> org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:77)
>   at 
> org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:53)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:182)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:112)
>   at 
> org.apache.beam.sdk.transforms.FilterTest.testFilterGreaterThan(FilterTest.java:122)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IndexOutOfBoundsException: 5
>   at 
> scala.collection.mutable.ResizableArray$class.apply(ResizableArray.scala:43)
>   at scala.collection.mutable.ArrayBuffer.apply(ArrayBuffer.scala:47)
>   at 
> scala.collection.IndexedSeqOptimized$class.segmentLength(IndexedSeqOptimized.scala:189)
>   at 
> scala.collection.mutable.ArrayBuffer.segmentLength(ArrayBuffer.scala:47)
>   at 
> scala.collection.IndexedSeqOptimized$class.indexWhere(IndexedSeqOptimized.scala:198)
>   at scala.collection.mutable.ArrayBuffer.indexWhere(ArrayBuffer.scala:47)
>   at scala.collection.GenSeqLike$class.indexOf(GenSeqLike.scala:144)
>   at scala.collection.AbstractSeq.indexOf(Seq.scala:40)
>   at scala.collection.GenSeqLike$class.indexOf(GenSeqLike.scala:128)
>   at scala.collection.AbstractSeq.indexOf(Seq.scala:40)
>   at 
> scala.collection.mutable.BufferLike$class.$minus$eq(BufferLike.scala:126)
>   at scala.collection.mutable.AbstractBuffer.$minus$eq(Buffer.scala:48)
>   at 
> org.apache.spark.metrics.MetricsSystem.removeSource(MetricsSystem.scala:159)
>   at 
> org.apache.beam.runners.spark.translation.SparkRuntimeContext.registerMetrics(SparkRuntimeContext.java:94)
>   at 
> org.apache.beam.runners.spark.translation.SparkRuntimeContext.<init>(SparkRuntimeContext.java:66)
>   at 
> org.apache.beam.runners.spark.translation.EvaluationContext.<init>(EvaluationContext.java:73)
>   at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:146)
> {code}





[GitHub] incubator-beam pull request #1332: [BEAM-891] Trying to fix an occasionally ...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1332


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: This closes #1332

2016-11-15 Thread amitsela
This closes #1332


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/503f26f4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/503f26f4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/503f26f4

Branch: refs/heads/master
Commit: 503f26f448ea9f46fcfcdd46e60cba80e4844e28
Parents: 47646d6 0331dd1
Author: Sela 
Authored: Tue Nov 15 19:53:52 2016 +0200
Committer: Sela 
Committed: Tue Nov 15 19:53:52 2016 +0200

--
 runners/spark/pom.xml   |  3 +-
 .../runners/spark/SparkPipelineOptions.java |  4 +-
 .../spark/translation/SparkRuntimeContext.java  |  4 +-
 .../runners/spark/ProvidedSparkContextTest.java | 26 -
 .../metrics/sink/NamedAggregatorsTest.java  | 13 +++--
 .../beam/runners/spark/io/AvroPipelineTest.java | 12 ++---
 .../beam/runners/spark/io/NumShardsTest.java| 10 ++--
 .../io/hadoop/HadoopFileFormatPipelineTest.java | 12 ++---
 .../spark/translation/SideEffectsTest.java  | 11 ++--
 .../streaming/EmptyStreamAssertionTest.java |  4 +-
 .../streaming/FlattenStreamingTest.java |  4 +-
 .../streaming/KafkaStreamingTest.java   |  4 +-
 .../ResumeFromCheckpointStreamingTest.java  |  4 +-
 .../streaming/SimpleStreamingWordCountTest.java |  6 +--
 .../utils/TestOptionsForStreaming.java  | 55 
 .../streaming/utils/TestPipelineOptions.java| 25 +
 .../utils/TestPipelineOptionsForStreaming.java  | 44 
 17 files changed, 132 insertions(+), 109 deletions(-)
--




[GitHub] incubator-beam pull request #1330: Remove tox cache from previous workspace

2016-11-15 Thread vikkyrk
Github user vikkyrk closed the pull request at:

https://github.com/apache/incubator-beam/pull/1330




[jira] [Commented] (BEAM-930) Findbugs doesn't pass in MongoDB

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667693#comment-15667693
 ] 

ASF GitHub Bot commented on BEAM-930:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1361


> Findbugs doesn't pass in MongoDB
> 
>
> Key: BEAM-930
> URL: https://issues.apache.org/jira/browse/BEAM-930
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Jean-Baptiste Onofré
> Fix For: 0.4.0-incubating
>
>
> {code}
> [INFO] --- findbugs-maven-plugin:3.0.1:check (default) @ 
> beam-sdks-java-io-mongodb ---
> [INFO] BugInstance size is 2
> [INFO] Error size is 0
> [INFO] Total bugs: 2
> [INFO] Boxing/unboxing to parse a primitive 
> org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.getEstimatedSizeBytes(PipelineOptions)
>  [org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource] At 
> MongoDbIO.java:[line 227]
> [INFO] Class org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn defines 
> non-transient non-serializable instance field client 
> [org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn] In MongoDbIO.java
> [INFO]  
> {code}




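For context on the two FindBugs findings quoted above, here is a self-contained sketch of the usual remediations, with hypothetical stand-in names (not the actual patch from the referenced pull request): parse straight to a primitive instead of boxing and unboxing, and mark a non-serializable client field transient with lazy re-creation after deserialization.

```java
import java.io.Serializable;

public class FindbugsStyleFixes {
  // Fix for "Boxing/unboxing to parse a primitive": Long.parseLong returns a
  // primitive directly, avoiding Long.valueOf(...).longValue().
  static long estimatedSizeBytes(String avgObjSizeStat, long documentCount) {
    return Long.parseLong(avgObjSizeStat) * documentCount;
  }

  // Fix for "non-transient non-serializable instance field": the client is
  // transient (so serializing the fn never drags it along) and is created
  // lazily, so a deserialized instance rebuilds it on first use.
  static class WriteFn implements Serializable {
    private transient Object client; // stand-in for a non-serializable MongoClient

    Object client() {
      if (client == null) {
        client = new Object(); // real code would open the connection here
      }
      return client;
    }
  }

  public static void main(String[] args) {
    System.out.println(estimatedSizeBytes("128", 1000)); // prints 128000
    System.out.println(new WriteFn().client() != null);  // prints true
  }
}
```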

[GitHub] incubator-beam pull request #1361: [BEAM-930] Fix findbugs and re-enable Mav...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1361




[jira] [Updated] (BEAM-981) Not possible to directly submit a pipeline on spark cluster

2016-11-15 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated BEAM-981:
--
Summary: Not possible to directly submit a pipeline on spark cluster  (was: 
Not possible to directly run pipeline on spark cluster)

> Not possible to directly submit a pipeline on spark cluster
> ---
>
> Key: BEAM-981
> URL: https://issues.apache.org/jira/browse/BEAM-981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.3.0-incubating, 0.4.0-incubating
>Reporter: Jean-Baptiste Onofré
>Assignee: Amit Sela
>
> It's not possible to directly run a pipeline on the spark runner (for 
> instance using {{mvn exec:java}}). It fails with:
> {code}
> [appclient-register-master-threadpool-0] INFO 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Connecting to 
> master spark://10.200.118.197:7077...
> [shuffle-client-0] ERROR org.apache.spark.network.client.TransportClient - 
> Failed to send RPC 6813731522650020739 to /10.200.118.197:7077: 
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
> at 
> io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:820)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:826)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:284)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1101)
> at 
> io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1148)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1090)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.safeExecute(SingleThreadEventExecutor.java:451)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:401)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
> at java.lang.Thread.run(Thread.java:745)
> [appclient-register-master-threadpool-0] WARN 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Failed to connect 
> to master 10.200.118.197:7077
> java.io.IOException: Failed to send RPC 6813731522650020739 to 
> /10.200.118.197:7077: java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:507)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:486)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:427)
> at 
> io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:129)
> at 
> io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:845)
>   

[jira] [Updated] (BEAM-981) Not possible to directly run pipeline on spark cluster

2016-11-15 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated BEAM-981:
--
Summary: Not possible to directly run pipeline on spark cluster  (was: Not 
possible to directly run pipeline on spark)

> Not possible to directly run pipeline on spark cluster
> --
>
> Key: BEAM-981
> URL: https://issues.apache.org/jira/browse/BEAM-981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: 0.3.0-incubating, 0.4.0-incubating
>Reporter: Jean-Baptiste Onofré
>Assignee: Amit Sela
>
> It's not possible to directly run a pipeline on the spark runner (for 
> instance using {{mvn exec:java}}). It fails with:
> {code}
> [appclient-register-master-threadpool-0] INFO 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Connecting to 
> master spark://10.200.118.197:7077...
> [shuffle-client-0] ERROR org.apache.spark.network.client.TransportClient - 
> Failed to send RPC 6813731522650020739 to /10.200.118.197:7077: 
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
> at 
> io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:820)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:826)
> at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
> at 
> io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:284)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1101)
> at 
> io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1148)
> at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1090)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.safeExecute(SingleThreadEventExecutor.java:451)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:401)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
> at java.lang.Thread.run(Thread.java:745)
> [appclient-register-master-threadpool-0] WARN 
> org.apache.spark.deploy.client.AppClient$ClientEndpoint - Failed to connect 
> to master 10.200.118.197:7077
> java.io.IOException: Failed to send RPC 6813731522650020739 to 
> /10.200.118.197:7077: java.lang.AbstractMethodError: 
> org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
> at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:507)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:486)
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:427)
> at 
> io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:129)
> at 
> io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:845)
> at 
> 

[2/2] incubator-beam git commit: Closes #1304

2016-11-15 Thread robertwb
Closes #1304


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c4208a89
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c4208a89
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c4208a89

Branch: refs/heads/python-sdk
Commit: c4208a899c38a92acdd95ccf37cb53237a593535
Parents: 6ac6e42 d0e3121
Author: Robert Bradshaw 
Authored: Tue Nov 15 08:53:07 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 08:53:07 2016 -0800

--
 sdks/python/apache_beam/runners/dataflow_runner.py | 1 +
 sdks/python/apache_beam/utils/names.py | 1 +
 2 files changed, 2 insertions(+)
--




[jira] [Created] (BEAM-981) Not possible to directly run pipeline on spark

2016-11-15 Thread JIRA
Jean-Baptiste Onofré created BEAM-981:
-

 Summary: Not possible to directly run pipeline on spark
 Key: BEAM-981
 URL: https://issues.apache.org/jira/browse/BEAM-981
 Project: Beam
  Issue Type: Bug
  Components: runner-spark
Affects Versions: 0.3.0-incubating, 0.4.0-incubating
Reporter: Jean-Baptiste Onofré
Assignee: Amit Sela


It's not possible to directly run a pipeline on the spark runner (for instance 
using {{mvn exec:java}}). It fails with:

{code}
[appclient-register-master-threadpool-0] INFO 
org.apache.spark.deploy.client.AppClient$ClientEndpoint - Connecting to master 
spark://10.200.118.197:7077...
[shuffle-client-0] ERROR org.apache.spark.network.client.TransportClient - 
Failed to send RPC 6813731522650020739 to /10.200.118.197:7077: 
java.lang.AbstractMethodError: 
org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
java.lang.AbstractMethodError: 
org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
at 
io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
at 
io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:820)
at 
io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
at 
io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
at 
io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:826)
at 
io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
at 
io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:284)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:748)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
at 
io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
at 
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1101)
at 
io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1148)
at 
io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1090)
at 
io.netty.util.concurrent.SingleThreadEventExecutor.safeExecute(SingleThreadEventExecutor.java:451)
at 
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:401)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
at java.lang.Thread.run(Thread.java:745)
[appclient-register-master-threadpool-0] WARN 
org.apache.spark.deploy.client.AppClient$ClientEndpoint - Failed to connect to 
master 10.200.118.197:7077
java.io.IOException: Failed to send RPC 6813731522650020739 to 
/10.200.118.197:7077: java.lang.AbstractMethodError: 
org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at 
org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
at 
org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
at 
io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:514)
at 
io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:507)
at 
io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:486)
at 
io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:427)
at 
io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:129)
at 
io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:845)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:750)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:740)
at 
io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:826)
at 
io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:733)
at 

[1/2] incubator-beam git commit: Allow for passing format so that we can migrate to BQ Avro export later

2016-11-15 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 6ac6e420f -> c4208a899


Allow for passing format so that we can migrate to BQ Avro export later


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d0e31218
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d0e31218
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d0e31218

Branch: refs/heads/python-sdk
Commit: d0e312184e319050baa02abff2c08348b6cfb651
Parents: 6ac6e42
Author: Sourabh Bajaj 
Authored: Mon Nov 7 18:15:17 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 08:53:06 2016 -0800

--
 sdks/python/apache_beam/runners/dataflow_runner.py | 1 +
 sdks/python/apache_beam/utils/names.py | 1 +
 2 files changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d0e31218/sdks/python/apache_beam/runners/dataflow_runner.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow_runner.py
index 57867fa..00b466b 100644
--- a/sdks/python/apache_beam/runners/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow_runner.py
@@ -515,6 +515,7 @@ class DataflowPipelineRunner(PipelineRunner):
 elif transform.source.format == 'text':
   step.add_property(PropertyNames.FILE_PATTERN, transform.source.path)
 elif transform.source.format == 'bigquery':
+  step.add_property(PropertyNames.BIGQUERY_EXPORT_FORMAT, 'FORMAT_JSON')
   # TODO(silviuc): Add table validation if transform.source.validate.
   if transform.source.table_reference is not None:
 step.add_property(PropertyNames.BIGQUERY_DATASET,

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d0e31218/sdks/python/apache_beam/utils/names.py
--
diff --git a/sdks/python/apache_beam/utils/names.py 
b/sdks/python/apache_beam/utils/names.py
index be8c92a..3edde3c 100644
--- a/sdks/python/apache_beam/utils/names.py
+++ b/sdks/python/apache_beam/utils/names.py
@@ -46,6 +46,7 @@ class PropertyNames(object):
   BIGQUERY_DATASET = 'dataset'
   BIGQUERY_QUERY = 'bigquery_query'
   BIGQUERY_USE_LEGACY_SQL = 'bigquery_use_legacy_sql'
+  BIGQUERY_EXPORT_FORMAT = 'bigquery_export_format'
   BIGQUERY_TABLE = 'table'
   BIGQUERY_PROJECT = 'project'
   BIGQUERY_SCHEMA = 'schema'



[1/2] incubator-beam git commit: Add IP configuration to Python SDK

2016-11-15 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 66f324b35 -> 6ac6e420f


Add IP configuration to Python SDK


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/af6f4e90
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/af6f4e90
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/af6f4e90

Branch: refs/heads/python-sdk
Commit: af6f4e90a239dbf1a2ad6f0cd8974602dfa9e9b4
Parents: 66f324b
Author: Sam McVeety 
Authored: Fri Nov 11 19:56:34 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 08:50:31 2016 -0800

--
 sdks/python/apache_beam/internal/apiclient.py | 9 +
 sdks/python/apache_beam/utils/options.py  | 4 
 2 files changed, 13 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/af6f4e90/sdks/python/apache_beam/internal/apiclient.py
--
diff --git a/sdks/python/apache_beam/internal/apiclient.py 
b/sdks/python/apache_beam/internal/apiclient.py
index 8c7cc29..5ac9d6e 100644
--- a/sdks/python/apache_beam/internal/apiclient.py
+++ b/sdks/python/apache_beam/internal/apiclient.py
@@ -210,6 +210,15 @@ class Environment(object):
 pool.teardownPolicy = (
 dataflow.WorkerPool
 .TeardownPolicyValueValuesEnum.TEARDOWN_ON_SUCCESS)
+if self.worker_options.use_public_ips is not None:
+  if self.worker_options.use_public_ips:
+pool.ipConfiguration = (
+dataflow.WorkerPool
+.IpConfigurationValueValuesEnum.WORKER_IP_PUBLIC)
+  else:
+pool.ipConfiguration = (
+dataflow.WorkerPool
+.IpConfigurationValueValuesEnum.WORKER_IP_PRIVATE)
 
 if self.standard_options.streaming:
   # Use separate data disk for streaming.

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/af6f4e90/sdks/python/apache_beam/utils/options.py
--
diff --git a/sdks/python/apache_beam/utils/options.py 
b/sdks/python/apache_beam/utils/options.py
index ecc85ba..f68335b 100644
--- a/sdks/python/apache_beam/utils/options.py
+++ b/sdks/python/apache_beam/utils/options.py
@@ -348,6 +348,10 @@ class WorkerOptions(PipelineOptions):
 help=
 ('The teardown policy for the VMs. By default this is left unset and '
  'the service sets the default policy.'))
+parser.add_argument(
+'--use_public_ips',
+default=None,
+help='Whether to assign public IP addresses to the worker machines.')
 
   def validate(self, validator):
 errors = []
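The Environment change distinguishes three states: option unset (leave ipConfiguration alone), true (public worker IPs), false (private worker IPs). One caveat: as declared, argparse stores the flag's value as a string, and any non-empty string (including 'false') is truthy in Python, so a naive truthiness check would treat '--use_public_ips false' as public. A sketch of one way to parse such a tri-state flag explicitly (parse_tristate_bool is an invented helper, not part of the SDK):

```python
import argparse


def parse_tristate_bool(value):
    """Map an optional string flag to None/True/False; reject anything else."""
    if value is None:
        return None
    lowered = value.lower()
    if lowered in ('true', '1', 'yes'):
        return True
    if lowered in ('false', '0', 'no'):
        return False
    raise ValueError('Expected a boolean value, got %r' % value)


parser = argparse.ArgumentParser()
parser.add_argument(
    '--use_public_ips',
    default=None,
    help='Whether to assign public IP addresses to the worker machines.')

args = parser.parse_args(['--use_public_ips', 'false'])
use_public_ips = parse_tristate_bool(args.use_public_ips)
print(use_public_ips)  # False, even though the raw string 'false' is truthy
```

Leaving the default as None preserves the "unset means service default" behavior that the Environment code above depends on.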



[2/2] incubator-beam git commit: Closes #1354

2016-11-15 Thread robertwb
Closes #1354


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6ac6e420
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6ac6e420
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6ac6e420

Branch: refs/heads/python-sdk
Commit: 6ac6e420f4f91e3326b07753004d3a56f34b4226
Parents: 66f324b af6f4e9
Author: Robert Bradshaw 
Authored: Tue Nov 15 08:50:32 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 08:50:32 2016 -0800

--
 sdks/python/apache_beam/internal/apiclient.py | 9 +
 sdks/python/apache_beam/utils/options.py  | 4 
 2 files changed, 13 insertions(+)
--




[1/2] incubator-beam git commit: Use batch GCS operations during FileSink write finalization

2016-11-15 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 560fe79f8 -> 66f324b35


Use batch GCS operations during FileSink write finalization


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/313191e1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/313191e1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/313191e1

Branch: refs/heads/python-sdk
Commit: 313191e129b884e4e14e9f503a757147d368217c
Parents: 560fe79
Author: Charles Chen 
Authored: Thu Nov 10 11:54:08 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 08:48:47 2016 -0800

--
 sdks/python/apache_beam/io/fileio.py  | 177 -
 sdks/python/apache_beam/io/fileio_test.py |   2 +-
 sdks/python/apache_beam/io/gcsio.py   |  78 +++
 sdks/python/apache_beam/io/gcsio_test.py  | 103 +-
 sdks/python/apache_beam/utils/retry.py|   3 +
 5 files changed, 298 insertions(+), 65 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/313191e1/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py 
b/sdks/python/apache_beam/io/fileio.py
index 669bfc9..ef20a7c 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -31,6 +31,7 @@ import zlib
 import weakref
 
 from apache_beam import coders
+from apache_beam.io import gcsio
 from apache_beam.io import iobase
 from apache_beam.io import range_trackers
 from apache_beam.runners.dataflow.native_io import iobase as dataflow_io
@@ -451,8 +452,6 @@ class ChannelFactory(object):
   'was %s' % type(compression_type))
 
 if path.startswith('gs://'):
-  # pylint: disable=wrong-import-order, wrong-import-position
-  from apache_beam.io import gcsio
   raw_file = gcsio.GcsIO().open(
   path,
   mode,
@@ -470,40 +469,92 @@ class ChannelFactory(object):
 return isinstance(fileobj, _CompressedFile)
 
   @staticmethod
-  def rename(src, dst):
+  def rename(src, dest):
 if src.startswith('gs://'):
-  assert dst.startswith('gs://'), dst
-  # pylint: disable=wrong-import-order, wrong-import-position
-  from apache_beam.io import gcsio
-  gcsio.GcsIO().rename(src, dst)
+  if not dest.startswith('gs://'):
+raise ValueError('Destination %r must be GCS path.', dest)
+  gcsio.GcsIO().rename(src, dest)
 else:
   try:
-os.rename(src, dst)
+os.rename(src, dest)
   except OSError as err:
 raise IOError(err)
 
   @staticmethod
-  def copytree(src, dst):
+  def rename_batch(src_dest_pairs):
+# Filter out local and GCS operations.
+local_src_dest_pairs = []
+gcs_src_dest_pairs = []
+for src, dest in src_dest_pairs:
+  if src.startswith('gs://'):
+if not dest.startswith('gs://'):
+  raise ValueError('Destination %r must be GCS path.', dest)
+gcs_src_dest_pairs.append((src, dest))
+  else:
+local_src_dest_pairs.append((src, dest))
+
+# Execute local operations.
+exceptions = []
+for src, dest in local_src_dest_pairs:
+  try:
+ChannelFactory.rename(src, dest)
+  except Exception as e:  # pylint: disable=broad-except
+exceptions.append((src, dest, e))
+
+# Execute GCS operations.
+exceptions += ChannelFactory._rename_gcs_batch(gcs_src_dest_pairs)
+
+return exceptions
+
+  @staticmethod
+  def _rename_gcs_batch(src_dest_pairs):
+# Prepare batches.
+gcs_batches = []
+gcs_current_batch = []
+for src, dest in src_dest_pairs:
+  if len(gcs_current_batch) == gcsio.MAX_BATCH_OPERATION_SIZE:
+gcs_batches.append(gcs_current_batch)
+gcs_current_batch = []
+  gcs_current_batch.append((src, dest))
+if gcs_current_batch:
+  gcs_batches.append(gcs_current_batch)
+
+# Execute GCS renames if any and return exceptions.
+exceptions = []
+for batch in gcs_batches:
+  copy_statuses = gcsio.GcsIO().copy_batch(batch)
+  copy_succeeded = []
+  for src, dest, exception in copy_statuses:
+if exception:
+  exceptions.append((src, dest, exception))
+else:
+  copy_succeeded.append((src, dest))
+  delete_batch = [src for src, dest in copy_succeeded]
+  delete_statuses = gcsio.GcsIO().delete_batch(delete_batch)
+  for i, (src, exception) in enumerate(delete_statuses):
+dest = copy_succeeded[i][1]
+if exception:
+  exceptions.append((src, dest, exception))
+return exceptions
+
+  @staticmethod
+  def copytree(src, dest):
 if src.startswith('gs://'):
-  assert dst.startswith('gs://'), dst
+  if not dest.startswith('gs://'):
+  
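The finalization scheme above is copy-then-delete with per-file error collection: copy all sources in bounded batches, delete only the sources whose copies succeeded, and accumulate (src, dest, exception) tuples for anything that failed. A self-contained sketch of that pattern against an in-memory store (FakeGcsIO is invented for illustration; the real GcsIO batch API may differ in detail):

```python
MAX_BATCH_OPERATION_SIZE = 100  # mirrors the batch-size limit referenced above


class FakeGcsIO(object):
    """In-memory stand-in for a storage client exposing batch copy/delete."""

    def __init__(self, files):
        self.files = files  # dict: path -> contents

    def copy_batch(self, src_dest_pairs):
        statuses = []
        for src, dest in src_dest_pairs:
            if src in self.files:
                self.files[dest] = self.files[src]
                statuses.append((src, dest, None))
            else:
                statuses.append((src, dest, IOError('missing: %s' % src)))
        return statuses

    def delete_batch(self, paths):
        return [(path, self.files.pop(path, None) and None) for path in paths]


def rename_batch(gcs, src_dest_pairs):
    """Copy every pair, delete only sources whose copy succeeded, and
    return a list of (src, dest, exception) for anything that failed."""
    exceptions = []
    copy_succeeded = []
    for src, dest, exc in gcs.copy_batch(src_dest_pairs):
        if exc:
            exceptions.append((src, dest, exc))
        else:
            copy_succeeded.append((src, dest))
    delete_statuses = gcs.delete_batch([src for src, _ in copy_succeeded])
    for i, (src, exc) in enumerate(delete_statuses):
        if exc:
            exceptions.append((src, copy_succeeded[i][1], exc))
    return exceptions


gcs = FakeGcsIO({'gs://bucket/tmp/shard-0': 'data'})
errors = rename_batch(gcs, [
    ('gs://bucket/tmp/shard-0', 'gs://bucket/out/shard-0'),
    ('gs://bucket/tmp/missing', 'gs://bucket/out/missing'),
])
```

The key property, visible in the diff as well, is that a failed copy never triggers a delete, so a finalization retry can safely re-run over the surviving sources.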

[2/2] incubator-beam git commit: Closes #1337

2016-11-15 Thread robertwb
Closes #1337


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/66f324b3
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/66f324b3
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/66f324b3

Branch: refs/heads/python-sdk
Commit: 66f324b350c50720cc88357bc2d56e2ecd99adc8
Parents: 560fe79 313191e
Author: Robert Bradshaw 
Authored: Tue Nov 15 08:48:48 2016 -0800
Committer: Robert Bradshaw 
Committed: Tue Nov 15 08:48:48 2016 -0800

--
 sdks/python/apache_beam/io/fileio.py  | 177 -
 sdks/python/apache_beam/io/fileio_test.py |   2 +-
 sdks/python/apache_beam/io/gcsio.py   |  78 +++
 sdks/python/apache_beam/io/gcsio_test.py  | 103 +-
 sdks/python/apache_beam/utils/retry.py|   3 +
 5 files changed, 298 insertions(+), 65 deletions(-)
--




Jenkins build is back to normal : beam_PostCommit_MavenVerify #1824

2016-11-15 Thread Apache Jenkins Server
See 



[jira] [Commented] (BEAM-918) Let users set STORAGE_LEVEL via SparkPipelineOptions.

2016-11-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667407#comment-15667407
 ] 

Jean-Baptiste Onofré commented on BEAM-918:
---

Cool, thanks !

> Let users set STORAGE_LEVEL via SparkPipelineOptions.
> -
>
> Key: BEAM-918
> URL: https://issues.apache.org/jira/browse/BEAM-918
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Amit Sela
>Assignee: Jean-Baptiste Onofré
>
> Spark provides different "STORAGE_LEVEL"s for caching RDDs (disk, memory, 
> ser/de, etc.).
> The runner decides on caching when necessary, for example when a RDD is 
> accessed repeatedly.
>  
> For batch, we can let users set their preferred STORAGE_LEVEL via 
> SparkPipelineOptions.
> Note: for streaming we force a "MEMORY_ONLY", since (among other things) the 
> runner heavily relies on stateful operations, and it's natural for streaming 
> to happen in memory.
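Sketched in the style of the Python SDK options shown elsewhere in this digest (BEAM-918 itself targets the Java Spark runner, and all names below are hypothetical), such a preference could be exposed as an option restricted to the storage levels Spark accepts:

```python
import argparse

# A few of Spark's storage levels; restricting choices catches typos at submit time.
SPARK_STORAGE_LEVELS = [
    'MEMORY_ONLY', 'MEMORY_ONLY_SER', 'MEMORY_AND_DISK',
    'MEMORY_AND_DISK_SER', 'DISK_ONLY',
]

parser = argparse.ArgumentParser()
parser.add_argument(
    '--storage_level',
    default='MEMORY_ONLY',
    choices=SPARK_STORAGE_LEVELS,
    help='Storage level the runner uses when it caches an RDD (batch only; '
         'streaming is forced to MEMORY_ONLY).')

options = parser.parse_args(['--storage_level', 'MEMORY_AND_DISK'])
print(options.storage_level)  # MEMORY_AND_DISK
```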



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-980) Document how to configure the DAG created by Apex Runner

2016-11-15 Thread Thomas Weise (JIRA)
Thomas Weise created BEAM-980:
-

 Summary: Document how to configure the DAG created by Apex Runner
 Key: BEAM-980
 URL: https://issues.apache.org/jira/browse/BEAM-980
 Project: Beam
  Issue Type: Task
  Components: runner-apex
Reporter: Thomas Weise


The Beam pipeline is translated to an Apex DAG of operators that have names 
that are derived from the transforms. In case of composite transforms those 
look like path names. Apex lets the user configure things like memory, vcores, 
parallelism through properties/attributes that reference the operator names. 
The configuration approach needs to be documented and supplemented with an 
example.
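For a concrete flavor of what needs documenting, here is a sketch of an Apex properties file targeting a named operator. The operator name "ParDo(ExtractWords)" and the values are made up for illustration; Apex attribute keys generally follow the dt.operator.<name>.attr.<ATTRIBUTE> convention:

```xml
<configuration>
  <!-- hypothetical operator name derived from a Beam transform path -->
  <property>
    <name>dt.operator.ParDo(ExtractWords).attr.MEMORY_MB</name>
    <value>2048</value>
  </property>
  <property>
    <name>dt.operator.ParDo(ExtractWords).attr.VCORES</name>
    <value>2</value>
  </property>
</configuration>
```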



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-930) Findbugs doesn't pass in MongoDB

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667382#comment-15667382
 ] 

ASF GitHub Bot commented on BEAM-930:
-

GitHub user jbonofre opened a pull request:

https://github.com/apache/incubator-beam/pull/1361

[BEAM-930] Fix findbugs and re-enable Maven plugin in MongoDbIO and 
MongoDbGridFSIO

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/incubator-beam BEAM-930

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1361.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1361






> Findbugs doesn't pass in MongoDB
> 
>
> Key: BEAM-930
> URL: https://issues.apache.org/jira/browse/BEAM-930
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Jean-Baptiste Onofré
> Fix For: 0.4.0-incubating
>
>
> {code}
> [INFO] --- findbugs-maven-plugin:3.0.1:check (default) @ 
> beam-sdks-java-io-mongodb ---
> [INFO] BugInstance size is 2
> [INFO] Error size is 0
> [INFO] Total bugs: 2
> [INFO] Boxing/unboxing to parse a primitive 
> org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.getEstimatedSizeBytes(PipelineOptions)
>  [org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource] At 
> MongoDbIO.java:[line 227]
> [INFO] Class org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn defines 
> non-transient non-serializable instance field client 
> [org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn] In MongoDbIO.java
> [INFO]  
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1361: [BEAM-930] Fix findbugs and re-enable Mav...

2016-11-15 Thread jbonofre
GitHub user jbonofre opened a pull request:

https://github.com/apache/incubator-beam/pull/1361

[BEAM-930] Fix findbugs and re-enable Maven plugin in MongoDbIO and 
MongoDbGridFSIO

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/incubator-beam BEAM-930

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1361.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1361






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Build failed in Jenkins: beam_PostCommit_MavenVerify #1823

2016-11-15 Thread Dan Halperin
Not your fault -- merge race and stale precommit results. Thanks for
solving the mystery!

More details:

* Added gridfs to mongodb-io
https://github.com/apache/incubator-beam/pull/1324
* Fixed findbugs for a stale version of mongodb-io
https://github.com/apache/incubator-beam/pull/1356

This is why we want a submit queue / merge bot :)

Dan

On Tue, Nov 15, 2016 at 3:36 PM, Jean-Baptiste Onofré 
wrote:

> Actually, the problem comes from the merge of MongoDB GridFS write.
>
> I'm fixing it.
>
> Sorry again.
>
> Regards
> JB
>
> On 11/15/2016 03:27 PM, Dan Halperin wrote:
>
>> Hmm, looks like heck
>> that https://github.com/apache/incubator-beam/pull/1356
>>  broke this despite
>> green precommit. Rolling back and investigating.
>>
>> On Tue, Nov 15, 2016 at 1:50 PM, Apache Jenkins Server
>> > wrote:
>>
>> See
>> > 1823/changes
>> > 1823/changes>>
>>
>> Changes:
>>
>> [ansela] [BEAM-762] Unify spark-runner EvaluationContext and
>>
>> [dhalperi] [BEAM-930] Fix findbugs and re-enable Maven plugin in
>> MongoDbIO
>>
>> --
>> [...truncated 4741 lines...]
>> [INFO]
>> [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce) @
>> beam-sdks-java-io-mongodb ---
>> [INFO]
>> [INFO] --- maven-remote-resources-plugin:1.5:process
>> (process-resource-bundles) @ beam-sdks-java-io-mongodb ---
>> [INFO]
>> [INFO] --- maven-resources-plugin:2.7:resources (default-resources)
>> @ beam-sdks-java-io-mongodb ---
>> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>> [INFO] skip non existing resourceDirectory
>> > ws/sdks/java/io/mongodb/src/main/resources
>> > ws/sdks/java/io/mongodb/src/main/resources>>
>> [INFO] Copying 3 resources
>> [INFO]
>> [INFO] --- maven-compiler-plugin:3.5.1:compile (default-compile) @
>> beam-sdks-java-io-mongodb ---
>> [INFO] Changes detected - recompiling the module!
>> [INFO] Compiling 3 source files to
>> > ws/sdks/java/io/mongodb/target/classes
>> > ws/sdks/java/io/mongodb/target/classes>>
>> [WARNING] bootstrap class path not set in conjunction with -source 1.7
>> [INFO]
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java>>:
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java>>
>> uses or overrides a deprecated API.
>> [INFO]
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java>>:
>> Recompile with -Xlint:deprecation for details.
>> [INFO]
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java>>:
>> Some input files use unchecked or unsafe operations.
>> [INFO]
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java
>> > ws/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/
>> io/mongodb/MongoDbGridFSIO.java>>:
>> Recompile with -Xlint:unchecked for details.
>> [INFO]
>> [INFO] --- maven-resources-plugin:2.7:testResources
>> (default-testResources) @ beam-sdks-java-io-mongodb ---
>> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>> [INFO] skip non existing resourceDirectory
>> > ws/sdks/java/io/mongodb/src/test/resources
>> 

Re: Build failed in Jenkins: beam_PostCommit_MavenVerify #1823

2016-11-15 Thread Jean-Baptiste Onofré

Actually, the problem comes from the merge of MongoDB GridFS write.

I'm fixing it.

Sorry again.

Regards
JB

On 11/15/2016 03:27 PM, Dan Halperin wrote:

Hmm, looks like heck
that https://github.com/apache/incubator-beam/pull/1356
 broke this despite
green precommit. Rolling back and investigating.



Re: Build failed in Jenkins: beam_PostCommit_MavenVerify #1823

2016-11-15 Thread Jean-Baptiste Onofré

Hmm, let me check as well.

Sorry about that.

Regards
JB

On 11/15/2016 03:27 PM, Dan Halperin wrote:

Hmm, looks like heck
that https://github.com/apache/incubator-beam/pull/1356
 broke this despite
green precommit. Rolling back and investigating.




incubator-beam git commit: Revert "Closes #1356"

2016-11-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master deef3faf7 -> bb8887398


Revert "Closes #1356"

This reverts commit deef3faf7df4885395417cb8f6b4aed6ec3d04e1, reversing
changes made to 2bc66f903cdfa328c4bb3546befbaa0f58bdd6fa.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/bb888739
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/bb888739
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/bb888739

Branch: refs/heads/master
Commit: bb888739841103115c7182f63bdc4858f68b298e
Parents: deef3fa
Author: Dan Halperin 
Authored: Tue Nov 15 06:28:35 2016 -0800
Committer: Dan Halperin 
Committed: Tue Nov 15 06:28:35 2016 -0800

--
 sdks/java/io/mongodb/pom.xml   | 13 +
 .../java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java |  4 ++--
 2 files changed, 15 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/bb888739/sdks/java/io/mongodb/pom.xml
--
diff --git a/sdks/java/io/mongodb/pom.xml b/sdks/java/io/mongodb/pom.xml
index 4b100a9..17dc6e7 100644
--- a/sdks/java/io/mongodb/pom.xml
+++ b/sdks/java/io/mongodb/pom.xml
@@ -31,6 +31,19 @@
   IO to read and write on MongoDB.
 
   
+
+  <pluginManagement>
+    <plugins>
+      <plugin>
+        <groupId>org.codehaus.mojo</groupId>
+        <artifactId>findbugs-maven-plugin</artifactId>
+        <configuration>
+          <skip>true</skip>
+        </configuration>
+      </plugin>
+    </plugins>
+  </pluginManagement>
+
 
   
 org.apache.maven.plugins

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/bb888739/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
--
diff --git 
a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
 
b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
index 71c017d..2729602 100644
--- 
a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
+++ 
b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
@@ -224,7 +224,7 @@ public class MongoDbIO {
   BasicDBObject stat = new BasicDBObject();
   stat.append("collStats", spec.collection());
   Document stats = mongoDatabase.runCommand(stat);
-  return Long.parseLong(stats.get("size").toString());
+  return Long.valueOf(stats.get("size").toString());
 }
 
 @Override
@@ -456,7 +456,7 @@ public class MongoDbIO {
 
 private static class WriteFn extends DoFn {
   private final Write spec;
-  private transient MongoClient client;
+  private MongoClient client;
   private List batch;
 
   public WriteFn(Write spec) {



Build failed in Jenkins: beam_PostCommit_MavenVerify #1823

2016-11-15 Thread Apache Jenkins Server
See 

Changes:

[ansela] [BEAM-762] Unify spark-runner EvaluationContext and

[dhalperi] [BEAM-930] Fix findbugs and re-enable Maven plugin in MongoDbIO

--
[...truncated 4741 lines...]
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce) @ 
beam-sdks-java-io-mongodb ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) 
@ beam-sdks-java-io-mongodb ---
[INFO] 
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ 
beam-sdks-java-io-mongodb ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.5.1:compile (default-compile) @ 
beam-sdks-java-io-mongodb ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 3 source files to 

[WARNING] bootstrap class path not set in conjunction with -source 1.7
[INFO] 
:
 

 uses or overrides a deprecated API.
[INFO] 
:
 Recompile with -Xlint:deprecation for details.
[INFO] 
:
 Some input files use unchecked or unsafe operations.
[INFO] 
:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ 
beam-sdks-java-io-mongodb ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.5.1:testCompile (default-testCompile) @ 
beam-sdks-java-io-mongodb ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to 

[INFO] 
:
 

 uses or overrides a deprecated API.
[INFO] 
:
 Recompile with -Xlint:deprecation for details.
[INFO] 
:
 

 uses unchecked or unsafe operations.
[INFO] 
:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.17:check (default) @ 
beam-sdks-java-io-mongodb ---
[INFO] Starting audit...
Audit done.
[INFO] 
[INFO] >>> findbugs-maven-plugin:3.0.1:check (default) > :findbugs @ 
beam-sdks-java-io-mongodb >>>
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.1:findbugs (findbugs) @ 
beam-sdks-java-io-mongodb ---
[INFO] Fork Value is true
 [java] Warnings generated: 3
[INFO] Done FindBugs Analysis
[INFO] 
[INFO] <<< findbugs-maven-plugin:3.0.1:check (default) < :findbugs @ 
beam-sdks-java-io-mongodb <<<
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.1:check (default) @ 
beam-sdks-java-io-mongodb ---
[INFO] BugInstance size is 3
[INFO] Error size is 0
[INFO] Total bugs: 3
[INFO] Class org.apache.beam.sdk.io.mongodb.MongoDbGridFSIO$GridFsWriteFn 
defines non-transient non-serializable instance field gridFsFile 
[org.apache.beam.sdk.io.mongodb.MongoDbGridFSIO$GridFsWriteFn] In 
MongoDbGridFSIO.java
[INFO] Class org.apache.beam.sdk.io.mongodb.MongoDbGridFSIO$GridFsWriteFn 
defines 

[jira] [Commented] (BEAM-979) ConcurrentModificationException exception after hours of running

2016-11-15 Thread Stas Levin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667018#comment-15667018
 ] 

Stas Levin commented on BEAM-979:
-

I believe 
{{scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)}} 
implies that it's a Beam issue rather than a Spark issue, since the failure 
surfaces at a Java-Scala interop boundary.

I think the only place {{spark-runner}} actually creates an {{RDD}} is when 
reading from a source, in our case {{SparkUnboundedSource}}. After hitting 
"Cmd+B" enough times, I found the following block to be of particular interest:

{code:java}
Iterator> mapWithStateDStream = 
inputDStream.mapWithState(StateSpec.function(StateSpecFunctions.mapSourceFunction(rc)));
{code}

Following the stacktrace, and realising that the {{iterator}} being modified is 
actually a {{java.util.ArrayList}}, also leads me to 
{{StateSpecFunctions#mapSourceFunction}}, which returns an {{iterator}} that is 
in fact an {{ArrayList}}, just as the stacktrace says.
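For readers less familiar with fail-fast iterators: {{ArrayList}}'s iterator throws {{ConcurrentModificationException}} the moment the backing list is structurally modified behind it, which is exactly the check visible at the top of the stacktrace ({{checkForComodification}}). CPython has an analogous fail-fast guard for dicts (plain lists do not check), which this small sketch demonstrates:

```python
# Java's ArrayList.Itr.checkForComodification() has a close cousin in CPython:
# mutating a dict while iterating over it raises RuntimeError.
d = {'a': 1, 'b': 2}
error = None
try:
    for key in d:
        d[key + '_copy'] = d[key]  # structural modification mid-iteration
except RuntimeError as e:
    error = e
print(type(error).__name__)  # RuntimeError
```

The lesson transfers directly: a collection handed to one thread as an iterator must not be mutated by another, which is the suspicion raised above about the source's {{ArrayList}}-backed iterator.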


> ConcurrentModificationException exception after hours of running
> 
>
> Key: BEAM-979
> URL: https://issues.apache.org/jira/browse/BEAM-979
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Stas Levin
>Assignee: Stas Levin
>
> {code}
>   
> User class threw exception: org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 1 in stage 4483.0 failed 4 times, most recent failure: 
> Lost task 1.3 in stage 4483.0 (TID 44548, .com): 
> java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
> at java.util.ArrayList$Itr.next(ArrayList.java:851)
> at 
> com.xxx.relocated.com.google.common.collect.Iterators$3.next(Iterators.java:177)
> at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
> at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:285)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:275)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (BEAM-979) ConcurrentModificationException exception after hours of running

2016-11-15 Thread Stas Levin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666975#comment-15666975
 ] 

Stas Levin edited comment on BEAM-979 at 11/15/16 12:24 PM:


I think it may have to do with {{StateSpecFunctions#mapSourceFunction}}, where 
we have this:

{code:java}

// read microbatch.
final List readValues = new ArrayList<>();

// ...

while (!finished) {
readValues.add(WindowedValue.of(reader.getCurrent(), 
reader.getCurrentTimestamp(), GlobalWindow.INSTANCE, PaneInfo.NO_FIRING));
finished = !reader.advance();
  }

// ...

return Iterators.unmodifiableIterator(readValues.iterator());

{code}

I think there may be a scenario where {{readValues}} is changed and read 
concurrently, which causes a {{ConcurrentModificationException}} to be thrown 
upon invoking the {{iterator#next}} we see in the stack trace.

One of the workarounds I'm trying is to change {{final List 
readValues = new ArrayList<>();}} to {{final Collection 
readValues = new ConcurrentLinkedQueue<>();}}. Ever since making this change 
the exception has not recurred, but given its nature it's too soon to tell for 
sure.

Moreover, this change may prevent a {{ConcurrentModificationException}} from 
being thrown, but we still need to consider whether its being thrown is merely 
a symptom of some scenario being improperly handled rather than a problem per 
se.
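The workaround above rests on {{ConcurrentLinkedQueue}}'s weakly consistent iterators, which never throw {{ConcurrentModificationException}}. A standalone sketch of that contrast (not the runner code itself):

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ClqDemo {
    // ConcurrentLinkedQueue iterators are weakly consistent: they traverse
    // elements that existed at iterator creation (and may or may not reflect
    // later additions), and they never throw ConcurrentModificationException.
    static boolean iterationSurvivesModification() {
        ConcurrentLinkedQueue<String> values = new ConcurrentLinkedQueue<>();
        values.add("a");
        Iterator<String> it = values.iterator();
        values.add("b"); // modification after iterator creation
        it.next();       // no exception, unlike ArrayList's fail-fast iterator
        return true;
    }

    public static void main(String[] args) {
        System.out.println(iterationSurvivesModification()); // true
    }
}
```

This is why the change can mask the exception without addressing whatever is mutating the collection in the first place.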


was (Author: staslev):
I think it may have to do with {StateSpecFunctions#mapSourceFunction}}, where 
we have this:

{code:java}

// read microbatch.
final List readValues = new ArrayList<>();

// ...

while (!finished) {
readValues.add(WindowedValue.of(reader.getCurrent(), 
reader.getCurrentTimestamp(), GlobalWindow.INSTANCE, PaneInfo.NO_FIRING));
finished = !reader.advance();
  }

// ...

return Iterators.unmodifiableIterator(readValues.iterator());

{code}

I think there may be a scenario where {{readValues}} may be changed and read 
concurrently, which causes a {{ConcurrentModificationException}} to be thrown 
upon invoking the {{iterator#next}} we see in the stacktrace. 

One of the workarounds I'm trying is to change {{final List 
readValues = new ArrayList<>();}} to {{final Collection 
readValues = new ConcurrentLinkedQueue<>();}}. Ever since making this change 
the exception has not reoccured, but given its nature it's too soon to tell for 
sure.

Moreover, this change may prevent from a {{ConcurrentModificationException}} 
being thrown, but we still need to consider if it being thrown is merely a 
symptom of some scenario being improperly handled rather than a problem per-se.

> ConcurrentModificationException exception after hours of running
> 
>
> Key: BEAM-979
> URL: https://issues.apache.org/jira/browse/BEAM-979
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Stas Levin
>Assignee: Stas Levin
>
> {code}
>   
> User class threw exception: org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 1 in stage 4483.0 failed 4 times, most recent failure: 
> Lost task 1.3 in stage 4483.0 (TID 44548, .com): 
> java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
> at java.util.ArrayList$Itr.next(ArrayList.java:851)
> at 
> com.xxx.relocated.com.google.common.collect.Iterators$3.next(Iterators.java:177)
> at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
> at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:285)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:275)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at 

[jira] [Commented] (BEAM-979) ConcurrentModificationException exception after hours of running

2016-11-15 Thread Amit Sela (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666994#comment-15666994
 ] 

Amit Sela commented on BEAM-979:


If this is indeed what happens, the exception is just a symptom.
What makes you think it's the read step?
We read here until we're done (the {{while}} loop is blocking), and to further 
protect against such a scenario I'm returning {{Iterators.unmodifiableIterator}}.
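Worth noting, though, that {{Iterators.unmodifiableIterator}} only guards against {{remove()}} on the iterator itself; it does not shield the backing list from outside mutation. A minimal stand-in for the Guava wrapper (written here to stay dependency-free) shows the exception still propagating:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class UnmodifiableIteratorDemo {
    // Minimal stand-in for Guava's Iterators.unmodifiableIterator: it only
    // blocks remove(); hasNext()/next() delegate to the fail-fast backing
    // iterator, so a ConcurrentModificationException still surfaces.
    static <T> Iterator<T> unmodifiableIterator(Iterator<T> delegate) {
        return new Iterator<T>() {
            public boolean hasNext() { return delegate.hasNext(); }
            public T next() { return delegate.next(); }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    static boolean cmeStillPropagates() {
        List<String> backing = new ArrayList<>();
        backing.add("a");
        Iterator<String> it = unmodifiableIterator(backing.iterator());
        backing.add("b"); // the backing list is still mutable from outside
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(cmeStillPropagates()); // true
    }
}
```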

> ConcurrentModificationException exception after hours of running
> 
>
> Key: BEAM-979
> URL: https://issues.apache.org/jira/browse/BEAM-979
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Stas Levin
>Assignee: Stas Levin
>
> {code}
>   
> User class threw exception: org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 1 in stage 4483.0 failed 4 times, most recent failure: 
> Lost task 1.3 in stage 4483.0 (TID 44548, .com): 
> java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
> at java.util.ArrayList$Itr.next(ArrayList.java:851)
> at 
> com.xxx.relocated.com.google.common.collect.Iterators$3.next(Iterators.java:177)
> at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
> at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:285)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:275)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Resolved] (BEAM-930) Findbugs doesn't pass in MongoDB

2016-11-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin resolved BEAM-930.
--
   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Findbugs doesn't pass in MongoDB
> 
>
> Key: BEAM-930
> URL: https://issues.apache.org/jira/browse/BEAM-930
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Jean-Baptiste Onofré
> Fix For: 0.4.0-incubating
>
>
> {code}
> [INFO] --- findbugs-maven-plugin:3.0.1:check (default) @ 
> beam-sdks-java-io-mongodb ---
> [INFO] BugInstance size is 2
> [INFO] Error size is 0
> [INFO] Total bugs: 2
> [INFO] Boxing/unboxing to parse a primitive 
> org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.getEstimatedSizeBytes(PipelineOptions)
>  [org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource] At 
> MongoDbIO.java:[line 227]
> [INFO] Class org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn defines 
> non-transient non-serializable instance field client 
> [org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn] In MongoDbIO.java
> [INFO]  
> {code}
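The "Boxing/unboxing to parse a primitive" warning above (later fixed in this thread by switching {{Long.valueOf}} to {{Long.parseLong}}) boils down to an unnecessary allocation; a small sketch of the two patterns:

```java
public class ParseSizeDemo {
    // Long.valueOf(String) allocates a boxed Long that is immediately
    // auto-unboxed when assigned to a primitive -- the pattern FindBugs flags.
    static long sizeViaValueOf(String s) {
        return Long.valueOf(s); // boxing, then unboxing
    }

    // Long.parseLong(String) returns a primitive long directly, with no box.
    static long sizeViaParseLong(String s) {
        return Long.parseLong(s);
    }

    public static void main(String[] args) {
        // Both parse the same value; only the allocation behaviour differs.
        System.out.println(sizeViaValueOf("227") == sizeViaParseLong("227")); // true
    }
}
```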





[jira] [Commented] (BEAM-979) ConcurrentModificationException exception after hours of running

2016-11-15 Thread Stas Levin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666975#comment-15666975
 ] 

Stas Levin commented on BEAM-979:
-

I think it may have to do with {{StateSpecFunctions#mapSourceFunction}}, where 
we have this:

{code:java}

// read microbatch.
final List readValues = new ArrayList<>();

// ...

while (!finished) {
readValues.add(WindowedValue.of(reader.getCurrent(), 
reader.getCurrentTimestamp(), GlobalWindow.INSTANCE, PaneInfo.NO_FIRING));
finished = !reader.advance();
  }

// ...

return Iterators.unmodifiableIterator(readValues.iterator());

{code}

I think there may be a scenario where {{readValues}} is changed and read 
concurrently, which causes a {{ConcurrentModificationException}} to be thrown 
upon invoking the {{iterator#next}} we see in the stack trace.

One of the workarounds I'm trying is to change {{final List 
readValues = new ArrayList<>();}} to {{final Collection 
readValues = new ConcurrentLinkedQueue<>();}}. Ever since making this change 
the exception has not recurred, but given its nature it's too soon to tell for 
sure.

Moreover, this change may prevent a {{ConcurrentModificationException}} from 
being thrown, but we still need to consider whether its being thrown is merely 
a symptom of some scenario being improperly handled rather than a problem per 
se.

> ConcurrentModificationException exception after hours of running
> 
>
> Key: BEAM-979
> URL: https://issues.apache.org/jira/browse/BEAM-979
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Stas Levin
>Assignee: Stas Levin
>
> {code}
>   
> User class threw exception: org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 1 in stage 4483.0 failed 4 times, most recent failure: 
> Lost task 1.3 in stage 4483.0 (TID 44548, .com): 
> java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
> at java.util.ArrayList$Itr.next(ArrayList.java:851)
> at 
> com.xxx.relocated.com.google.common.collect.Iterators$3.next(Iterators.java:177)
> at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
> at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:285)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:275)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}





[GitHub] incubator-beam pull request #1356: [BEAM-930] Fix findbugs and re-enable Mav...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1356


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: [BEAM-930] Fix findbugs and re-enable Maven plugin in MongoDbIO

2016-11-15 Thread dhalperi
[BEAM-930] Fix findbugs and re-enable Maven plugin in MongoDbIO


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7c87c662
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7c87c662
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7c87c662

Branch: refs/heads/master
Commit: 7c87c662db99e581e28e3198c90d2f43a8eebe6d
Parents: 2bc66f9
Author: Jean-Baptiste Onofré 
Authored: Mon Nov 14 16:10:53 2016 +0100
Committer: Dan Halperin 
Committed: Tue Nov 15 04:02:08 2016 -0800

--
 sdks/java/io/mongodb/pom.xml   | 13 -
 .../java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java |  4 ++--
 2 files changed, 2 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/7c87c662/sdks/java/io/mongodb/pom.xml
--
diff --git a/sdks/java/io/mongodb/pom.xml b/sdks/java/io/mongodb/pom.xml
index 17dc6e7..4b100a9 100644
--- a/sdks/java/io/mongodb/pom.xml
+++ b/sdks/java/io/mongodb/pom.xml
@@ -31,19 +31,6 @@
   IO to read and write on MongoDB.
 
   
-
-  
-
-
-  org.codehaus.mojo
-  findbugs-maven-plugin
-  
-true
-  
-
-  
-
-
 
   
 org.apache.maven.plugins

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/7c87c662/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
--
diff --git 
a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
 
b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
index 2729602..71c017d 100644
--- 
a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
+++ 
b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java
@@ -224,7 +224,7 @@ public class MongoDbIO {
   BasicDBObject stat = new BasicDBObject();
   stat.append("collStats", spec.collection());
   Document stats = mongoDatabase.runCommand(stat);
-  return Long.valueOf(stats.get("size").toString());
+  return Long.parseLong(stats.get("size").toString());
 }
 
 @Override
@@ -456,7 +456,7 @@ public class MongoDbIO {
 
 private static class WriteFn extends DoFn {
   private final Write spec;
-  private MongoClient client;
+  private transient MongoClient client;
   private List batch;
 
   public WriteFn(Write spec) {



[1/2] incubator-beam git commit: Closes #1356

2016-11-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 2bc66f903 -> deef3faf7


Closes #1356


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/deef3faf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/deef3faf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/deef3faf

Branch: refs/heads/master
Commit: deef3faf7df4885395417cb8f6b4aed6ec3d04e1
Parents: 2bc66f9 7c87c66
Author: Dan Halperin 
Authored: Tue Nov 15 04:02:08 2016 -0800
Committer: Dan Halperin 
Committed: Tue Nov 15 04:02:08 2016 -0800

--
 sdks/java/io/mongodb/pom.xml   | 13 -
 .../java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java |  4 ++--
 2 files changed, 2 insertions(+), 15 deletions(-)
--




[jira] [Commented] (BEAM-918) Let users set STORAGE_LEVEL via SparkPipelineOptions.

2016-11-15 Thread Amit Sela (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666959#comment-15666959
 ] 

Amit Sela commented on BEAM-918:


[~jbonofre] you're good to go ;)

> Let users set STORAGE_LEVEL via SparkPipelineOptions.
> -
>
> Key: BEAM-918
> URL: https://issues.apache.org/jira/browse/BEAM-918
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Amit Sela
>Assignee: Jean-Baptiste Onofré
>
> Spark provides different "STORAGE_LEVEL"s for caching RDDs (disk, memory, 
> ser/de, etc.).
> The runner decides on caching when necessary, for example when a RDD is 
> accessed repeatedly.
>  
> For batch, we can let users set their preferred STORAGE_LEVEL via 
> SparkPipelineOptions.
> Note: for streaming we force a "MEMORY_ONLY", since (among other things) the 
> runner heavily relies on stateful operations, and it's natural for streaming 
> to happen in memory.
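As a rough illustration of the validation such an option might need (all names here are hypothetical and not Beam's actual {{SparkPipelineOptions}} API), the resolution logic could look like:

```java
public class StorageLevelOption {
    // Hypothetical helper: validate a user-supplied storage-level string,
    // forcing MEMORY_ONLY for streaming pipelines as the issue describes.
    static String resolve(String requested, boolean streaming) {
        if (streaming) {
            return "MEMORY_ONLY"; // streaming relies on in-memory state
        }
        switch (requested) {
            case "MEMORY_ONLY":
            case "MEMORY_AND_DISK":
            case "DISK_ONLY":
                return requested;
            default:
                throw new IllegalArgumentException(
                    "Unknown storage level: " + requested);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("DISK_ONLY", false)); // DISK_ONLY
        System.out.println(resolve("DISK_ONLY", true));  // MEMORY_ONLY
    }
}
```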





[jira] [Resolved] (BEAM-762) Use composition over inheritance in spark StreamingEvaluationContext if two contexts are necessary.

2016-11-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela resolved BEAM-762.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Use composition over inheritance in spark StreamingEvaluationContext if two 
> contexts are necessary.
> ---
>
> Key: BEAM-762
> URL: https://issues.apache.org/jira/browse/BEAM-762
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Pei He
>Assignee: Aviem Zur
>Priority: Minor
>  Labels: starter
> Fix For: 0.4.0-incubating
>
>
> After working on PR: https://github.com/apache/incubator-beam/pull/1096 ,
> I felt it is easy to forget updating spark streaming context with current 
> inheritance.
> And, having a single EvaluationContext to support both streaming and batch is 
> even better.





[GitHub] incubator-beam pull request #1291: [BEAM-762] Unify spark-runner EvaluationC...

2016-11-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1291




[2/2] incubator-beam git commit: This closes #1291

2016-11-15 Thread amitsela
This closes #1291


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2bc66f90
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2bc66f90
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2bc66f90

Branch: refs/heads/master
Commit: 2bc66f903cdfa328c4bb3546befbaa0f58bdd6fa
Parents: 9c300cd 1bef01f
Author: Sela 
Authored: Tue Nov 15 13:37:20 2016 +0200
Committer: Sela 
Committed: Tue Nov 15 13:37:20 2016 +0200

--
 .../apache/beam/runners/spark/SparkRunner.java  |   4 +-
 .../spark/translation/BoundedDataset.java   | 114 
 .../beam/runners/spark/translation/Dataset.java |  34 +++
 .../spark/translation/EvaluationContext.java| 230 +++-
 .../spark/translation/TransformTranslator.java  |  99 +++
 .../SparkRunnerStreamingContextFactory.java |   7 +-
 .../streaming/StreamingEvaluationContext.java   | 272 ---
 .../streaming/StreamingTransformTranslator.java | 135 +
 .../translation/streaming/UnboundedDataset.java | 103 +++
 9 files changed, 464 insertions(+), 534 deletions(-)
--




[1/2] incubator-beam git commit: [BEAM-762] Unify spark-runner EvaluationContext and StreamingEvaluationContext

2016-11-15 Thread amitsela
Repository: incubator-beam
Updated Branches:
  refs/heads/master 9c300cde8 -> 2bc66f903


[BEAM-762] Unify spark-runner EvaluationContext and StreamingEvaluationContext

PR 1291 review changes.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1bef01fe
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1bef01fe
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1bef01fe

Branch: refs/heads/master
Commit: 1bef01fef5ff5ff9a960c85b00c2cc4aa504ce4d
Parents: 9c300cd
Author: Aviem Zur 
Authored: Sun Nov 13 13:57:07 2016 +0200
Committer: Sela 
Committed: Tue Nov 15 13:35:49 2016 +0200

--
 .../apache/beam/runners/spark/SparkRunner.java  |   4 +-
 .../spark/translation/BoundedDataset.java   | 114 
 .../beam/runners/spark/translation/Dataset.java |  34 +++
 .../spark/translation/EvaluationContext.java| 230 +++-
 .../spark/translation/TransformTranslator.java  |  99 +++
 .../SparkRunnerStreamingContextFactory.java |   7 +-
 .../streaming/StreamingEvaluationContext.java   | 272 ---
 .../streaming/StreamingTransformTranslator.java | 135 +
 .../translation/streaming/UnboundedDataset.java | 103 +++
 9 files changed, 464 insertions(+), 534 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1bef01fe/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkRunner.java
--
diff --git 
a/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkRunner.java 
b/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkRunner.java
index 45c7f55..6bbef39 100644
--- a/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkRunner.java
+++ b/runners/spark/src/main/java/org/apache/beam/runners/spark/SparkRunner.java
@@ -26,7 +26,6 @@ import 
org.apache.beam.runners.spark.translation.SparkPipelineTranslator;
 import org.apache.beam.runners.spark.translation.TransformEvaluator;
 import org.apache.beam.runners.spark.translation.TransformTranslator;
 import 
org.apache.beam.runners.spark.translation.streaming.SparkRunnerStreamingContextFactory;
-import 
org.apache.beam.runners.spark.translation.streaming.StreamingEvaluationContext;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
@@ -49,6 +48,7 @@ import 
org.apache.spark.streaming.api.java.JavaStreamingContext;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+
 /**
  * The SparkRunner translate operations defined on a pipeline to a 
representation
  * executable by Spark, and then submitting the job to Spark to be executed. 
If we wanted to run
@@ -136,7 +136,7 @@ public final class SparkRunner extends 
PipelineRunner {
 jssc.start();
 
 // if recovering from checkpoint, we have to reconstruct the 
EvaluationResult instance.
-return contextFactory.getCtxt() == null ? new 
StreamingEvaluationContext(jssc.sc(),
+return contextFactory.getCtxt() == null ? new 
EvaluationContext(jssc.sc(),
 pipeline, jssc, mOptions.getTimeout()) : contextFactory.getCtxt();
   } else {
 if (mOptions.getTimeout() > 0) {

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1bef01fe/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/BoundedDataset.java
--
diff --git 
a/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/BoundedDataset.java
 
b/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/BoundedDataset.java
new file mode 100644
index 000..774efb9
--- /dev/null
+++ 
b/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/BoundedDataset.java
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.spark.translation;
+

[jira] [Commented] (BEAM-762) Use composition over inheritance in spark StreamingEvaluationContext if two contexts are necessary.

2016-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666948#comment-15666948
 ] 

ASF GitHub Bot commented on BEAM-762:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1291


> Use composition over inheritance in spark StreamingEvaluationContext if two 
> contexts are necessary.
> ---
>
> Key: BEAM-762
> URL: https://issues.apache.org/jira/browse/BEAM-762
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Pei He
>Priority: Minor
>  Labels: starter
>
> After working on PR: https://github.com/apache/incubator-beam/pull/1096 ,
> I felt it is easy to forget updating spark streaming context with current 
> inheritance.
> And, having a single EvaluationContext to support both streaming and batch is 
> even better.





[jira] [Commented] (BEAM-979) ConcurrentModificationException exception after hours of running

2016-11-15 Thread Amit Sela (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666904#comment-15666904
 ] 

Amit Sela commented on BEAM-979:


[~staslev] what pipeline did you execute? Is there any other information about 
the step it occurs in?

> ConcurrentModificationException exception after hours of running
> 
>
> Key: BEAM-979
> URL: https://issues.apache.org/jira/browse/BEAM-979
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Stas Levin
>Assignee: Stas Levin
>
> {code}
>   
> User class threw exception: org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 1 in stage 4483.0 failed 4 times, most recent failure: 
> Lost task 1.3 in stage 4483.0 (TID 44548, .com): 
> java.util.ConcurrentModificationException
> at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
> at java.util.ArrayList$Itr.next(ArrayList.java:851)
> at 
> com.xxx.relocated.com.google.common.collect.Iterators$3.next(Iterators.java:177)
> at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
> at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
> at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:285)
> at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:275)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
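The `ArrayList$Itr.checkForComodification` frame shows this is Java's fail-fast iterator check firing: the `ArrayList` backing the (Guava-wrapped) iterator was structurally modified while Spark's `MemoryStore.unrollSafely` was still consuming it. A minimal standalone sketch of that failure mode (a hypothetical demo class, not Beam or Spark code):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class ComodificationDemo {

    // Returns true when mutating the list mid-iteration trips the
    // fail-fast check, mirroring the ArrayList$Itr frame in the trace.
    static boolean triggersComodification() {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            values.add(i);
        }
        Iterator<Integer> it = values.iterator();
        try {
            while (it.hasNext()) {
                int v = it.next();
                // Adding through the list (not the iterator) bumps the
                // list's modCount, so it no longer matches the iterator's
                // expectedModCount and the next it.next() throws.
                values.add(v + 100);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(triggersComodification()); // prints "true"
    }
}
```

In the reported trace the mutation presumably comes from another thread touching the same underlying collection while Spark unrolls the cached block, which would explain why the failure only surfaces after hours of running.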



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-979) ConcurrentModificationException exception after hours of running

2016-11-15 Thread Stas Levin (JIRA)
Stas Levin created BEAM-979:
---

 Summary: ConcurrentModificationException exception after hours of 
running
 Key: BEAM-979
 URL: https://issues.apache.org/jira/browse/BEAM-979
 Project: Beam
  Issue Type: Bug
  Components: runner-spark
Reporter: Stas Levin
Assignee: Stas Levin


{code}

User class threw exception: org.apache.spark.SparkException: Job aborted due to 
stage failure: Task 1 in stage 4483.0 failed 4 times, most recent failure: Lost 
task 1.3 in stage 4483.0 (TID 44548, .com): 
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
at java.util.ArrayList$Itr.next(ArrayList.java:851)
at 
com.xxx.relocated.com.google.common.collect.Iterators$3.next(Iterators.java:177)
at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
at scala.collection.Iterator$$anon$13.next(Iterator.scala:372)
at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:285)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:275)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:313)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:277)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}





[jira] [Updated] (BEAM-948) Add ability to write/sink results to MongoDBGridfs

2016-11-15 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated BEAM-948:
--
Fix Version/s: 0.4.0-incubating

> Add ability to write/sink results to MongoDBGridfs
> --
>
> Key: BEAM-948
> URL: https://issues.apache.org/jira/browse/BEAM-948
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Daniel Kulp
>Assignee: Daniel Kulp
> Fix For: 0.4.0-incubating
>
>
> 0.3 added the ability to read files from GridFS for processing. It would be 
> good to have the sink side implemented as well, to allow writing results there.




