[jira] [Created] (BEAM-1110) Propagate operator- and pipeline-level metadata where possible

2016-12-07 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-1110:
-

 Summary: Propagate operator- and pipeline-level metadata where 
possible
 Key: BEAM-1110
 URL: https://issues.apache.org/jira/browse/BEAM-1110
 Project: Beam
  Issue Type: Improvement
  Components: runner-apex
Reporter: Daniel Halperin
Priority: Minor


User metadata from Beam pipelines:

* Aggregators (/ Metrics)
* Step names
* Pipeline options

It would be nice to propagate these metadata to the REST APIs served by the 
stram. From talking to [~sandeepdeshmukh], this seems doable but probably not 
done yet.

Supporting these features will give Beam pipelines a consistent experience 
across runners. See also this [blog 
post|https://cloud.google.com/blog/big-data/2016/06/dataflow-updates-see-more-details-about-your-pipelines]
 for how this is realized in the Cloud Dataflow runner, and a related JIRA 
(BEAM-1107) for the Flink Runner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-27) Add user-ready API for interacting with timers

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731074#comment-15731074
 ] 

ASF GitHub Bot commented on BEAM-27:


GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1550

[BEAM-27] Reject timers for ParDo in each runner separately and exclude 
timer tests

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

There are many trivial commits spread across the runners, in two phases.

1. Add a JUnit category for timers and exclude it from all runners. I've 
included one trivial test just to see that it is working. (FWIW I have hacked 
the direct runner enough in later commits to get this test to pass "for the 
wrong reasons")
2. Remove the timer rejection code from `ParDo` and add it to every runner.

R: @tgroh @aljoscha @amitsela @tweise 

I'll wait for all four to LGTM.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam UsesTimers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1550.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1550


commit 5c99d2c3a573c3963fd6efa9e11ed860d26f0365
Author: Kenneth Knowles 
Date:   2016-12-07T04:49:15Z

Add JUnit category UsesTimersInParDo

With this, various runners can disable tests for this capability
until they support it.

commit 604695f1b7cbd6dfeda94d82532fde535b3e3448
Author: Kenneth Knowles 
Date:   2016-12-07T04:52:24Z

Disable tests for timers in ParDo for Apex runner

commit 58cac2f4960044e646249eb8a3db27b345e1
Author: Kenneth Knowles 
Date:   2016-12-07T04:52:49Z

Disable tests for timers in ParDo for Flink runner

commit f27bdf9b3c2e609c54615db60f0682b34f6575c5
Author: Kenneth Knowles 
Date:   2016-12-07T04:53:05Z

Disables tests for timers in ParDo for Spark runner

commit be9c04bf0f2af7b428a61d3b74b68ea18c4f0e03
Author: Kenneth Knowles 
Date:   2016-12-07T04:53:16Z

Disable tests for timers in ParDo for Dataflow runner

commit b66c5cb4f92e2c0d834f44ba246b8186cc9d0291
Author: Kenneth Knowles 
Date:   2016-12-08T04:24:34Z

Disable tests for timers in ParDo for direct runner

commit e4f062495878328359234da4e586d12cb42965f9
Author: Kenneth Knowles 
Date:   2016-12-07T04:49:40Z

Add basic test for timers in ParDoTest

commit ad4cbbc18ffbbedaaa17415220d7ff2908931797
Author: Kenneth Knowles 
Date:   2016-12-08T04:04:51Z

No longer reject timers in ParDo

commit 0703093767db2776965384b09b322f01f4b79b39
Author: Kenneth Knowles 
Date:   2016-12-08T04:34:34Z

Reject timers for ParDo in ApexRunner

commit 4b312e8d01e324db77e5f2de62eae35efc153adb
Author: Kenneth Knowles 
Date:   2016-12-08T04:34:59Z

Reject timers for ParDo in FlinkRunner

commit 2e2bc885deb5000a269800bdbead8f17dff5a3a9
Author: Kenneth Knowles 
Date:   2016-12-08T04:35:08Z

Reject timers for ParDo in SparkRunner

commit 2d2d83f721c21f681f431974edc4e044d44828c5
Author: Kenneth Knowles 
Date:   2016-12-08T04:37:33Z

Reject timers for ParDo in DirectRunner




> Add user-ready API for interacting with timers
> --
>
> Key: BEAM-27
> URL: https://issues.apache.org/jira/browse/BEAM-27
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> Pipeline authors will benefit from a different factorization of interaction 
> with underlying timers. The current APIs are targeted at runner implementers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1550: [BEAM-27] Reject timers for ParDo in each...

2016-12-07 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1550

[BEAM-27] Reject timers for ParDo in each runner separately and exclude 
timer tests

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

There are many trivial commits spread across the runners, in two phases.

1. Add a JUnit category for timers and exclude it from all runners. I've 
included one trivial test just to see that it is working. (FWIW I have hacked 
the direct runner enough in later commits to get this test to pass "for the 
wrong reasons")
2. Remove the timer rejection code from `ParDo` and add it to every runner.

R: @tgroh @aljoscha @amitsela @tweise 

I'll wait for all four to LGTM.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam UsesTimers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1550.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1550


commit 5c99d2c3a573c3963fd6efa9e11ed860d26f0365
Author: Kenneth Knowles 
Date:   2016-12-07T04:49:15Z

Add JUnit category UsesTimersInParDo

With this, various runners can disable tests for this capability
until they support it.

commit 604695f1b7cbd6dfeda94d82532fde535b3e3448
Author: Kenneth Knowles 
Date:   2016-12-07T04:52:24Z

Disable tests for timers in ParDo for Apex runner

commit 58cac2f4960044e646249eb8a3db27b345e1
Author: Kenneth Knowles 
Date:   2016-12-07T04:52:49Z

Disable tests for timers in ParDo for Flink runner

commit f27bdf9b3c2e609c54615db60f0682b34f6575c5
Author: Kenneth Knowles 
Date:   2016-12-07T04:53:05Z

Disables tests for timers in ParDo for Spark runner

commit be9c04bf0f2af7b428a61d3b74b68ea18c4f0e03
Author: Kenneth Knowles 
Date:   2016-12-07T04:53:16Z

Disable tests for timers in ParDo for Dataflow runner

commit b66c5cb4f92e2c0d834f44ba246b8186cc9d0291
Author: Kenneth Knowles 
Date:   2016-12-08T04:24:34Z

Disable tests for timers in ParDo for direct runner

commit e4f062495878328359234da4e586d12cb42965f9
Author: Kenneth Knowles 
Date:   2016-12-07T04:49:40Z

Add basic test for timers in ParDoTest

commit ad4cbbc18ffbbedaaa17415220d7ff2908931797
Author: Kenneth Knowles 
Date:   2016-12-08T04:04:51Z

No longer reject timers in ParDo

commit 0703093767db2776965384b09b322f01f4b79b39
Author: Kenneth Knowles 
Date:   2016-12-08T04:34:34Z

Reject timers for ParDo in ApexRunner

commit 4b312e8d01e324db77e5f2de62eae35efc153adb
Author: Kenneth Knowles 
Date:   2016-12-08T04:34:59Z

Reject timers for ParDo in FlinkRunner

commit 2e2bc885deb5000a269800bdbead8f17dff5a3a9
Author: Kenneth Knowles 
Date:   2016-12-08T04:35:08Z

Reject timers for ParDo in SparkRunner

commit 2d2d83f721c21f681f431974edc4e044d44828c5
Author: Kenneth Knowles 
Date:   2016-12-08T04:37:33Z

Reject timers for ParDo in DirectRunner




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-25) Add user-ready API for interacting with state

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731043#comment-15731043
 ] 

ASF GitHub Bot commented on BEAM-25:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1543


> Add user-ready API for interacting with state
> -
>
> Key: BEAM-25
> URL: https://issues.apache.org/jira/browse/BEAM-25
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: State
>
> Our current state API is targeted at runner implementers, not pipeline 
> authors. As such it has many capabilities that are not necessary nor 
> desirable for simple use cases of stateful ParDo (such as dynamic state tag 
> creation). Implement a simple state intended for user access.
> (Details of our current thoughts in forthcoming design doc)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[3/3] incubator-beam git commit: Move CopyOnAccessStateInternals to runners/direct

2016-12-07 Thread kenn
Move CopyOnAccessStateInternals to runners/direct


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/09e2f309
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/09e2f309
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/09e2f309

Branch: refs/heads/master
Commit: 09e2f309c9554f58d2b9a2be5f83ef2751d9d40b
Parents: 7729594
Author: Kenneth Knowles 
Authored: Wed Dec 7 14:28:39 2016 -0800
Committer: Kenneth Knowles 
Committed: Wed Dec 7 20:21:27 2016 -0800

--
 .../CopyOnAccessInMemoryStateInternals.java | 467 +++
 .../runners/direct/DirectExecutionContext.java  |   1 -
 .../beam/runners/direct/EvaluationContext.java  |   1 -
 .../GroupAlsoByWindowEvaluatorFactory.java  |   1 -
 .../beam/runners/direct/ParDoEvaluator.java |   1 -
 .../runners/direct/StepTransformResult.java |   1 -
 .../beam/runners/direct/TransformResult.java|   1 -
 .../CopyOnAccessInMemoryStateInternalsTest.java | 562 +++
 .../runners/direct/EvaluationContextTest.java   |   1 -
 .../StatefulParDoEvaluatorFactoryTest.java  |   1 -
 .../CopyOnAccessInMemoryStateInternals.java | 453 ---
 .../sdk/util/state/InMemoryStateInternals.java  |  33 +-
 .../CopyOnAccessInMemoryStateInternalsTest.java | 552 --
 13 files changed, 1054 insertions(+), 1021 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/09e2f309/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternals.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternals.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternals.java
new file mode 100644
index 000..e486a75
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternals.java
@@ -0,0 +1,467 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.base.Optional;
+import com.google.common.collect.Iterables;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Map;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.transforms.Combine.CombineFn;
+import org.apache.beam.sdk.transforms.Combine.KeyedCombineFn;
+import 
org.apache.beam.sdk.transforms.CombineWithContext.KeyedCombineFnWithContext;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.transforms.windowing.OutputTimeFn;
+import org.apache.beam.sdk.util.CombineFnUtil;
+import org.apache.beam.sdk.util.state.AccumulatorCombiningState;
+import org.apache.beam.sdk.util.state.BagState;
+import org.apache.beam.sdk.util.state.InMemoryStateInternals.InMemoryBag;
+import 
org.apache.beam.sdk.util.state.InMemoryStateInternals.InMemoryCombiningValue;
+import org.apache.beam.sdk.util.state.InMemoryStateInternals.InMemoryState;
+import 
org.apache.beam.sdk.util.state.InMemoryStateInternals.InMemoryStateBinder;
+import org.apache.beam.sdk.util.state.InMemoryStateInternals.InMemoryValue;
+import 
org.apache.beam.sdk.util.state.InMemoryStateInternals.InMemoryWatermarkHold;
+import org.apache.beam.sdk.util.state.State;
+import org.apache.beam.sdk.util.state.StateContext;
+import org.apache.beam.sdk.util.state.StateContexts;
+import org.apache.beam.sdk.util.state.StateInternals;
+import org.apache.beam.sdk.util.state.StateNamespace;
+import org.apache.beam.sdk.util.state.StateTable;
+import org.apache.beam.sdk.util.state.StateTag;
+import org.apache.beam.sdk.util.state.StateTag.StateBinder;
+import org.apache.beam.sdk.util.state.ValueState;
+import org.apache.beam.sdk.util.state.WatermarkHoldState;
+import org.joda.time.Instant;
+

[2/3] incubator-beam git commit: Move CopyOnAccessStateInternals to runners/direct

2016-12-07 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/09e2f309/sdks/java/core/src/test/java/org/apache/beam/sdk/util/state/CopyOnAccessInMemoryStateInternalsTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/state/CopyOnAccessInMemoryStateInternalsTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/util/state/CopyOnAccessInMemoryStateInternalsTest.java
deleted file mode 100644
index ad70bca..000
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/state/CopyOnAccessInMemoryStateInternalsTest.java
+++ /dev/null
@@ -1,552 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.util.state;
-
-import static org.hamcrest.Matchers.containsInAnyOrder;
-import static org.hamcrest.Matchers.emptyIterable;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.nullValue;
-import static org.hamcrest.Matchers.theInstance;
-import static org.junit.Assert.assertThat;
-import static org.mockito.Mockito.never;
-import static org.mockito.Mockito.spy;
-import static org.mockito.Mockito.verify;
-
-import org.apache.beam.sdk.coders.CannotProvideCoderException;
-import org.apache.beam.sdk.coders.CoderRegistry;
-import org.apache.beam.sdk.coders.StringUtf8Coder;
-import org.apache.beam.sdk.coders.VarIntCoder;
-import org.apache.beam.sdk.testing.TestPipeline;
-import org.apache.beam.sdk.transforms.Combine.CombineFn;
-import org.apache.beam.sdk.transforms.Combine.KeyedCombineFn;
-import org.apache.beam.sdk.transforms.Sum;
-import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
-import org.apache.beam.sdk.transforms.windowing.OutputTimeFn;
-import org.apache.beam.sdk.transforms.windowing.OutputTimeFns;
-import org.joda.time.Instant;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.ExpectedException;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-
-/**
- * Tests for {@link CopyOnAccessInMemoryStateInternals}.
- */
-@RunWith(JUnit4.class)
-public class CopyOnAccessInMemoryStateInternalsTest {
-  @Rule public ExpectedException thrown = ExpectedException.none();
-  private String key = "foo";
-  @Test
-  public void testGetWithEmpty() {
-CopyOnAccessInMemoryStateInternals internals =
-CopyOnAccessInMemoryStateInternals.withUnderlying(key, null);
-StateNamespace namespace = new StateNamespaceForTest("foo");
-StateTag bagTag = StateTags.bag("foo", 
StringUtf8Coder.of());
-BagState stringBag = internals.state(namespace, bagTag);
-assertThat(stringBag.read(), emptyIterable());
-
-stringBag.add("bar");
-stringBag.add("baz");
-assertThat(stringBag.read(), containsInAnyOrder("baz", "bar"));
-
-BagState reReadStringBag = internals.state(namespace, bagTag);
-assertThat(reReadStringBag.read(), containsInAnyOrder("baz", "bar"));
-  }
-
-  @Test
-  public void testGetWithAbsentInUnderlying() {
-CopyOnAccessInMemoryStateInternals underlying =
-CopyOnAccessInMemoryStateInternals.withUnderlying(key, null);
-CopyOnAccessInMemoryStateInternals internals =
-CopyOnAccessInMemoryStateInternals.withUnderlying(key, underlying);
-
-StateNamespace namespace = new StateNamespaceForTest("foo");
-StateTag bagTag = StateTags.bag("foo", 
StringUtf8Coder.of());
-BagState stringBag = internals.state(namespace, bagTag);
-assertThat(stringBag.read(), emptyIterable());
-
-stringBag.add("bar");
-stringBag.add("baz");
-assertThat(stringBag.read(), containsInAnyOrder("baz", "bar"));
-
-BagState reReadVoidBag = internals.state(namespace, bagTag);
-assertThat(reReadVoidBag.read(), containsInAnyOrder("baz", "bar"));
-
-BagState underlyingState = underlying.state(namespace, bagTag);
-assertThat(underlyingState.read(), emptyIterable());
-  }
-
-  /**
-   * Tests that retrieving state with an underlying StateInternals with an 
existing value returns
-   * a value that initially has equal value to the provided state but can be 
modified without
-   * modifying the existing state.
-   */
-  @Test
- 

[GitHub] incubator-beam pull request #1543: [BEAM-25] Move CopyOnAccessStateInternals...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1543


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-27) Add user-ready API for interacting with timers

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730970#comment-15730970
 ] 

ASF GitHub Bot commented on BEAM-27:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1528


> Add user-ready API for interacting with timers
> --
>
> Key: BEAM-27
> URL: https://issues.apache.org/jira/browse/BEAM-27
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> Pipeline authors will benefit from a different factorization of interaction 
> with underlying timers. The current APIs are targeted at runner implementers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/5] incubator-beam git commit: Access to OnTimerContext via DoFnInvokers.ArgumentProvider

2016-12-07 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 6807480a9 -> 772959447


Access to OnTimerContext via DoFnInvokers.ArgumentProvider


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2883062e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2883062e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2883062e

Branch: refs/heads/master
Commit: 2883062eebe8dba849ab89627f6aeb53266ac1a8
Parents: 42b506f
Author: Kenneth Knowles 
Authored: Tue Dec 6 20:10:21 2016 -0800
Committer: Kenneth Knowles 
Committed: Wed Dec 7 19:22:43 2016 -0800

--
 .../org/apache/beam/runners/core/SimpleDoFnRunner.java | 13 +
 .../org/apache/beam/runners/core/SplittableParDo.java  |  5 +
 .../org/apache/beam/sdk/transforms/DoFnAdapters.java   | 12 
 .../org/apache/beam/sdk/transforms/DoFnTester.java |  7 +++
 .../beam/sdk/transforms/reflect/DoFnInvoker.java   |  8 
 5 files changed, 45 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2883062e/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
index 68751f0..0d41a8d 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
@@ -35,6 +35,7 @@ import org.apache.beam.sdk.transforms.Combine.CombineFn;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.DoFn.Context;
 import org.apache.beam.sdk.transforms.DoFn.InputProvider;
+import org.apache.beam.sdk.transforms.DoFn.OnTimerContext;
 import org.apache.beam.sdk.transforms.DoFn.OutputReceiver;
 import org.apache.beam.sdk.transforms.DoFn.ProcessContext;
 import org.apache.beam.sdk.transforms.reflect.DoFnInvoker;
@@ -403,6 +404,12 @@ public class SimpleDoFnRunner implements 
DoFnRunner doFn) {
+  throw new UnsupportedOperationException(
+  "Cannot access OnTimerContext outside of @OnTimer methods.");
+}
+
+@Override
 public InputProvider inputProvider() {
   throw new UnsupportedOperationException("InputProvider is for testing 
only.");
 }
@@ -589,6 +596,12 @@ public class SimpleDoFnRunner implements 
DoFnRunner doFn) {
+  throw new UnsupportedOperationException(
+  "Cannot access OnTimerContext outside of @OnTimer methods.");
+}
+
+@Override
 public InputProvider inputProvider() {
   throw new UnsupportedOperationException("InputProvider parameters are 
not supported.");
 }

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2883062e/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
index 78f373b..580e842 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
@@ -663,6 +663,11 @@ public class SplittableParDo
   }
 
   @Override
+  public DoFn.OnTimerContext onTimerContext(DoFn doFn) {
+throw new IllegalStateException("Unexpected extra context access on a 
splittable DoFn");
+  }
+
+  @Override
   public DoFn.InputProvider inputProvider() {
 // DoFnSignatures should have verified that this DoFn doesn't access 
extra context.
 throw new IllegalStateException("Unexpected extra context access on a 
splittable DoFn");

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2883062e/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnAdapters.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnAdapters.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnAdapters.java
index 6ee42e7..e15b08b 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnAdapters.java
+++ 

[5/5] incubator-beam git commit: This closes #1528

2016-12-07 Thread kenn
This closes #1528


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/77295944
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/77295944
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/77295944

Branch: refs/heads/master
Commit: 772959447a24737956ec7142f6f157807d877d64
Parents: 6807480 a26ed13
Author: Kenneth Knowles 
Authored: Wed Dec 7 19:22:44 2016 -0800
Committer: Kenneth Knowles 
Committed: Wed Dec 7 19:22:44 2016 -0800

--
 .../beam/runners/core/SimpleDoFnRunner.java | 13 +++
 .../beam/runners/core/SplittableParDo.java  |  5 ++
 .../org/apache/beam/sdk/transforms/DoFn.java| 22 +
 .../beam/sdk/transforms/DoFnAdapters.java   | 12 +++
 .../apache/beam/sdk/transforms/DoFnTester.java  |  7 ++
 .../reflect/ByteBuddyDoFnInvokerFactory.java| 11 +++
 .../sdk/transforms/reflect/DoFnInvoker.java |  8 ++
 .../sdk/transforms/reflect/DoFnSignature.java   | 26 +-
 .../sdk/transforms/reflect/DoFnSignatures.java  | 90 
 .../DoFnSignaturesSplittableDoFnTest.java   |  3 +-
 .../transforms/reflect/DoFnSignaturesTest.java  | 47 ++
 11 files changed, 226 insertions(+), 18 deletions(-)
--




[GitHub] incubator-beam pull request #1528: [BEAM-27] Add DoFn.OnTimerContext and sup...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1528


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/5] incubator-beam git commit: Add DoFn.OnTimerContext

2016-12-07 Thread kenn
Add DoFn.OnTimerContext


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3f8c8076
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3f8c8076
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3f8c8076

Branch: refs/heads/master
Commit: 3f8c80769a3bb38da64c6628fd8c4669fcac794b
Parents: 6807480
Author: Kenneth Knowles 
Authored: Tue Dec 6 20:10:06 2016 -0800
Committer: Kenneth Knowles 
Committed: Wed Dec 7 19:22:43 2016 -0800

--
 .../org/apache/beam/sdk/transforms/DoFn.java| 22 
 1 file changed, 22 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/3f8c8076/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
index 7aabec9..699403f 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
@@ -43,6 +43,7 @@ import org.apache.beam.sdk.transforms.display.HasDisplayData;
 import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
 import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
 import org.apache.beam.sdk.transforms.windowing.PaneInfo;
+import org.apache.beam.sdk.util.TimeDomain;
 import org.apache.beam.sdk.util.Timer;
 import org.apache.beam.sdk.util.TimerSpec;
 import org.apache.beam.sdk.util.state.State;
@@ -295,6 +296,27 @@ public abstract class DoFn implements 
Serializable, HasDisplayD
   }
 
   /**
+   * Information accessible when running a {@link DoFn.OnTimer} method.
+   */
+  public abstract class OnTimerContext extends Context {
+
+/**
+ * Returns the timestamp of the current timer.
+ */
+public abstract Instant timestamp();
+
+/**
+ * Returns the window in which the timer is firing.
+ */
+public abstract BoundedWindow window();
+
+/**
+ * Returns the time domain of the current timer.
+ */
+public abstract TimeDomain timeDomain();
+  }
+
+  /**
* Returns the allowed timestamp skew duration, which is the maximum
* duration that timestamps can be shifted backward in
* {@link DoFn.Context#outputWithTimestamp}.



[4/5] incubator-beam git commit: Support OnTimerContext in ByteBuddyDoFnInvokerFactory

2016-12-07 Thread kenn
Support OnTimerContext in ByteBuddyDoFnInvokerFactory


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a26ed134
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a26ed134
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a26ed134

Branch: refs/heads/master
Commit: a26ed134bc57970ed83156f93d660a637465a9d6
Parents: 2883062
Author: Kenneth Knowles 
Authored: Tue Dec 6 20:19:32 2016 -0800
Committer: Kenneth Knowles 
Committed: Wed Dec 7 19:22:44 2016 -0800

--
 .../sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java   | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a26ed134/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
index 3480603..01ddd86 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
@@ -85,6 +85,7 @@ public class ByteBuddyDoFnInvokerFactory implements 
DoFnInvokerFactory {
 
   public static final String CONTEXT_PARAMETER_METHOD = "context";
   public static final String PROCESS_CONTEXT_PARAMETER_METHOD = 
"processContext";
+  public static final String ON_TIMER_CONTEXT_PARAMETER_METHOD = 
"onTimerContext";
   public static final String WINDOW_PARAMETER_METHOD = "window";
   public static final String INPUT_PROVIDER_PARAMETER_METHOD = "inputProvider";
   public static final String OUTPUT_RECEIVER_PARAMETER_METHOD = 
"outputReceiver";
@@ -556,7 +557,11 @@ public class ByteBuddyDoFnInvokerFactory implements 
DoFnInvokerFactory {
 
   @Override
   public StackManipulation dispatch(OnTimerContextParameter p) {
-throw new UnsupportedOperationException("OnTimerContext is not yet 
supported.");
+return new StackManipulation.Compound(
+pushDelegate,
+MethodInvocation.invoke(
+getExtraContextFactoryMethodDescription(
+ON_TIMER_CONTEXT_PARAMETER_METHOD, DoFn.class)));
   }
 
   @Override



[2/5] incubator-beam git commit: Add OnTimerContext parameter support to DoFnSignature

2016-12-07 Thread kenn
Add OnTimerContext parameter support to DoFnSignature


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/42b506f0
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/42b506f0
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/42b506f0

Branch: refs/heads/master
Commit: 42b506f06dbd73e03a2cfad4e7677e9698b3c020
Parents: 3f8c807
Author: Kenneth Knowles 
Authored: Tue Dec 6 20:18:18 2016 -0800
Committer: Kenneth Knowles 
Committed: Wed Dec 7 19:22:43 2016 -0800

--
 .../reflect/ByteBuddyDoFnInvokerFactory.java|  6 ++
 .../sdk/transforms/reflect/DoFnSignature.java   | 26 +-
 .../sdk/transforms/reflect/DoFnSignatures.java  | 90 
 .../DoFnSignaturesSplittableDoFnTest.java   |  3 +-
 .../transforms/reflect/DoFnSignaturesTest.java  | 47 ++
 5 files changed, 154 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/42b506f0/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
index 8750d64..3480603 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
@@ -69,6 +69,7 @@ import 
org.apache.beam.sdk.transforms.reflect.DoFnSignature.OnTimerMethod;
 import org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.Cases;
 import 
org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.ContextParameter;
 import 
org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.InputProviderParameter;
+import 
org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.OnTimerContextParameter;
 import 
org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.OutputReceiverParameter;
 import 
org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.ProcessContextParameter;
 import 
org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.RestrictionTrackerParameter;
@@ -554,6 +555,11 @@ public class ByteBuddyDoFnInvokerFactory implements 
DoFnInvokerFactory {
   }
 
   @Override
+  public StackManipulation dispatch(OnTimerContextParameter p) {
+throw new UnsupportedOperationException("OnTimerContext is not yet 
supported.");
+  }
+
+  @Override
   public StackManipulation dispatch(WindowParameter p) {
 return new StackManipulation.Compound(
 simpleExtraContextParameter(WINDOW_PARAMETER_METHOD),

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/42b506f0/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
index 0750949..ccc9ac3 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
@@ -175,6 +175,8 @@ public abstract class DoFnSignature {
 return cases.dispatch((ContextParameter) this);
   } else if (this instanceof ProcessContextParameter) {
 return cases.dispatch((ProcessContextParameter) this);
+  } else if (this instanceof OnTimerContextParameter) {
+return cases.dispatch((OnTimerContextParameter) this);
   } else if (this instanceof WindowParameter) {
 return cases.dispatch((WindowParameter) this);
   } else if (this instanceof RestrictionTrackerParameter) {
@@ -200,6 +202,7 @@ public abstract class DoFnSignature {
 public interface Cases {
   ResultT dispatch(ContextParameter p);
   ResultT dispatch(ProcessContextParameter p);
+  ResultT dispatch(OnTimerContextParameter p);
   ResultT dispatch(WindowParameter p);
   ResultT dispatch(InputProviderParameter p);
   ResultT dispatch(OutputReceiverParameter p);
@@ -225,6 +228,11 @@ public abstract class DoFnSignature {
 }
 
 @Override
+public ResultT dispatch(OnTimerContextParameter p) {
+  return dispatchDefault(p);
+}
+
+@Override
 public ResultT dispatch(WindowParameter p) {
   return dispatchDefault(p);
 }
@@ 

Build failed in Jenkins: beam_PostCommit_Python_Verify #841

2016-12-07 Thread Apache Jenkins Server
See 

--
[...truncated 3137 lines...]
  File 
"
 line 243, in dumps
dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
  File 
"
 line 236, in dump
pik.dump(obj)
  File "/usr/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 160, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 835, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 798, in save_function
obj.__dict__), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 1039, in save_cell
pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 798, in save_function
obj.__dict__), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 1039, in save_cell
pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 798, in save_function
obj.__dict__), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
  File 

Build failed in Jenkins: beam_PostCommit_Python_Verify #840

2016-12-07 Thread Apache Jenkins Server
See 

Changes:

[ccy] Fix template_runner_test on Windows

--
[...truncated 2978 lines...]
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_par_do_with_multiple_outputs_and_using_return 
(apache_beam.dataflow_test.DataflowTest)'

==
ERROR: test_par_do_with_multiple_outputs_and_using_yield 
(apache_beam.dataflow_test.DataflowTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 147, in test_par_do_with_multiple_outputs_and_using_yield
pipeline.run()
  File 
"
 line 159, in run
return self.runner.run(self)
  File 
"
 line 175, in run
pipeline.options, job_version)
  File 
"
 line 360, in __init__
credentials = get_service_credentials()
  File 
"
 line 171, in get_service_credentials
credentials.get_access_token()
  File "build/bdist.linux-x86_64/egg/oauth2client/client.py", line 677, in 
get_access_token
self.refresh(http)
  File "build/bdist.linux-x86_64/egg/oauth2client/client.py", line 560, in 
refresh
self._refresh(http.request)
  File 
"
 line 113, in _refresh
output, _ = gcloud_process.communicate()
  File "/usr/lib/python2.7/subprocess.py", line 791, in communicate
stdout = _eintr_retry_call(self.stdout.read)
  File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
return func(*args)
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_par_do_with_multiple_outputs_and_using_yield 
(apache_beam.dataflow_test.DataflowTest)'

==
ERROR: test_par_do_with_side_input_as_arg 
(apache_beam.dataflow_test.DataflowTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 94, in test_par_do_with_side_input_as_arg
pipeline.run()
  File 
"
 line 159, in run
return self.runner.run(self)
  File 

[GitHub] incubator-beam pull request #1548: Fix template_runner_test on Windows

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1548


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Fix template_runner_test on Windows

2016-12-07 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 4a660c604 -> 43057960a


Fix template_runner_test on Windows


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a565ca10
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a565ca10
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a565ca10

Branch: refs/heads/python-sdk
Commit: a565ca1008309564dba41d551c68ea553cd83a7b
Parents: 4a660c6
Author: Charles Chen 
Authored: Wed Dec 7 16:09:49 2016 -0800
Committer: Charles Chen 
Committed: Wed Dec 7 16:09:49 2016 -0800

--
 .../apache_beam/runners/template_runner_test.py   | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a565ca10/sdks/python/apache_beam/runners/template_runner_test.py
--
diff --git a/sdks/python/apache_beam/runners/template_runner_test.py 
b/sdks/python/apache_beam/runners/template_runner_test.py
index a141521..cc3d7c2 100644
--- a/sdks/python/apache_beam/runners/template_runner_test.py
+++ b/sdks/python/apache_beam/runners/template_runner_test.py
@@ -33,24 +33,30 @@ from apache_beam.internal import apiclient
 class TemplatingDataflowPipelineRunnerTest(unittest.TestCase):
   """TemplatingDataflow tests."""
   def test_full_completion(self):
-dummy_file = tempfile.NamedTemporaryFile()
+# Create dummy file and close it.  Note that we need to do this because
+# Windows does not allow NamedTemporaryFiles to be reopened elsewhere
+# before the temporary file is closed.
+dummy_file = tempfile.NamedTemporaryFile(delete=False)
+dummy_file_name = dummy_file.name
+dummy_file.close()
+
 dummy_dir = tempfile.mkdtemp()
 
 remote_runner = DataflowPipelineRunner()
 pipeline = Pipeline(remote_runner,
 options=PipelineOptions([
 '--dataflow_endpoint=ignored',
-'--sdk_location=' + dummy_file.name,
+'--sdk_location=' + dummy_file_name,
 '--job_name=test-job',
 '--project=test-project',
 '--staging_location=' + dummy_dir,
 '--temp_location=/dev/null',
-'--template_location=' + dummy_file.name,
+'--template_location=' + dummy_file_name,
 '--no_auth=True']))
 
 pipeline | beam.Create([1, 2, 3]) | beam.Map(lambda x: x) # pylint: 
disable=expression-not-assigned
 pipeline.run()
-with open(dummy_file.name) as template_file:
+with open(dummy_file_name) as template_file:
   saved_job_dict = json.load(template_file)
   self.assertEqual(
   saved_job_dict['environment']['sdkPipelineOptions']



[2/2] incubator-beam git commit: Closes #1548

2016-12-07 Thread robertwb
Closes #1548


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/43057960
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/43057960
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/43057960

Branch: refs/heads/python-sdk
Commit: 43057960af906ff5c435ea016d5f6df5bccaea40
Parents: 4a660c6 a565ca1
Author: Robert Bradshaw 
Authored: Wed Dec 7 18:17:39 2016 -0800
Committer: Robert Bradshaw 
Committed: Wed Dec 7 18:17:39 2016 -0800

--
 .../apache_beam/runners/template_runner_test.py   | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)
--




[jira] [Created] (BEAM-1109) Python ValidatesRunner Tests on Dataflow Service Timeout

2016-12-07 Thread Mark Liu (JIRA)
Mark Liu created BEAM-1109:
--

 Summary: Python ValidatesRunner Tests on Dataflow Service Timeout
 Key: BEAM-1109
 URL: https://issues.apache.org/jira/browse/BEAM-1109
 Project: Beam
  Issue Type: Bug
  Components: sdk-py, testing
Reporter: Mark Liu
Assignee: Mark Liu


ValidatesRunner tests timeout with following logs:
https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/839/console

Need to increase "--process-timeout" in execution command 
(https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/run_postcommit.sh#L77).





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-1096) flink streaming side output optimization using SplitStream

2016-12-07 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated BEAM-1096:
---
Assignee: Alexey Diomin

> flink streaming side output optimization using SplitStream
> --
>
> Key: BEAM-1096
> URL: https://issues.apache.org/jira/browse/BEAM-1096
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 0.4.0-incubating
>Reporter: Alexey Diomin
>Assignee: Alexey Diomin
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> Current implementation:
> 1) send all events in all output streams
> 2) filtering streams for necessary tags
> Cons: increased cpu usage for serialization all events
> Proposed implementation:
> 1) route event in correct streams based on tag



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1520: [BEAM-1096] flink streaming side output o...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1520


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1096) flink streaming side output optimization using SplitStream

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730793#comment-15730793
 ] 

ASF GitHub Bot commented on BEAM-1096:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1520


> flink streaming side output optimization using SplitStream
> --
>
> Key: BEAM-1096
> URL: https://issues.apache.org/jira/browse/BEAM-1096
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 0.4.0-incubating
>Reporter: Alexey Diomin
>Priority: Minor
>
> Current implementation:
> 1) send all events in all output streams
> 2) filtering streams for necessary tags
> Cons: increased cpu usage for serialization all events
> Proposed implementation:
> 1) route event in correct streams based on tag



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #1520

2016-12-07 Thread aljoscha
This closes #1520


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6807480a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6807480a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6807480a

Branch: refs/heads/master
Commit: 6807480a97f2315b3f48ad8dd5accb4e30475fa4
Parents: c53e0b1 f1a5704
Author: Aljoscha Krettek 
Authored: Thu Dec 8 09:55:32 2016 +0800
Committer: Aljoscha Krettek 
Committed: Thu Dec 8 09:55:32 2016 +0800

--
 .../FlinkStreamingTransformTranslators.java | 28 +---
 1 file changed, 18 insertions(+), 10 deletions(-)
--




[1/2] incubator-beam git commit: [BEAM-1096] Flink streaming side output optimization using SplitStream

2016-12-07 Thread aljoscha
Repository: incubator-beam
Updated Branches:
  refs/heads/master c53e0b162 -> 6807480a9


[BEAM-1096] Flink streaming side output optimization using SplitStream


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/f1a5704a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/f1a5704a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/f1a5704a

Branch: refs/heads/master
Commit: f1a5704a505b01d7d4649b61d1f6697859367964
Parents: c53e0b1
Author: Alexey Diomin 
Authored: Wed Dec 7 09:48:35 2016 +0400
Committer: Aljoscha Krettek 
Committed: Thu Dec 8 09:55:22 2016 +0800

--
 .../FlinkStreamingTransformTranslators.java | 28 +---
 1 file changed, 18 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/f1a5704a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
--
diff --git 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
index 47935eb..7b32c76 100644
--- 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
+++ 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
@@ -78,11 +78,13 @@ import 
org.apache.flink.api.common.functions.RichFlatMapFunction;
 import org.apache.flink.api.common.typeinfo.TypeInformation;
 import org.apache.flink.api.java.tuple.Tuple2;
 import org.apache.flink.core.fs.FileSystem;
+import org.apache.flink.streaming.api.collector.selector.OutputSelector;
 import org.apache.flink.streaming.api.datastream.DataStream;
 import org.apache.flink.streaming.api.datastream.DataStreamSink;
 import org.apache.flink.streaming.api.datastream.DataStreamSource;
 import org.apache.flink.streaming.api.datastream.KeyedStream;
 import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
+import org.apache.flink.streaming.api.datastream.SplitStream;
 import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
 import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
 import org.apache.flink.streaming.api.operators.TwoInputStreamOperator;
@@ -554,6 +556,14 @@ public class FlinkStreamingTransformTranslators {
 .transform(transform.getName(), outputUnionTypeInformation, 
doFnOperator);
   }
 
+  SplitStream splitStream = unionOutputStream
+  .split(new OutputSelector() {
+@Override
+public Iterable select(RawUnionValue value) {
+  return 
Collections.singletonList(Integer.toString(value.getUnionTag()));
+}
+  });
+
   for (Map.Entry output : outputs.entrySet()) 
{
 final int outputTag = tagsToLabels.get(output.getKey());
 
@@ -561,17 +571,15 @@ public class FlinkStreamingTransformTranslators {
 context.getTypeInfo(output.getValue());
 
 @SuppressWarnings("unchecked")
-DataStream filtered =
-unionOutputStream.flatMap(new FlatMapFunction() {
-  @Override
-  public void flatMap(RawUnionValue value, Collector out) 
throws Exception {
-if (value.getUnionTag() == outputTag) {
-  out.collect(value.getValue());
-}
-  }
-}).returns(outputTypeInfo);
+DataStream unwrapped = splitStream.select(String.valueOf(outputTag))
+  .flatMap(new FlatMapFunction() {
+@Override
+public void flatMap(RawUnionValue value, Collector out) 
throws Exception {
+  out.collect(value.getValue());
+}
+  }).returns(outputTypeInfo);
 
-context.setOutputDataStream(output.getValue(), filtered);
+context.setOutputDataStream(output.getValue(), unwrapped);
   }
 }
 



[jira] [Updated] (BEAM-1095) Add support set config for reuse-object on flink

2016-12-07 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated BEAM-1095:
---
Assignee: Alexey Diomin

> Add support set config for reuse-object on flink
> 
>
> Key: BEAM-1095
> URL: https://issues.apache.org/jira/browse/BEAM-1095
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 0.4.0-incubating
>Reporter: Alexey Diomin
>Assignee: Alexey Diomin
>Priority: Trivial
> Fix For: 0.4.0-incubating
>
>
> Object-reuse is dangerous setting and disabled by default,
> but sometime we need use this option to omit performance overhead for 
> serialization-deserialization objects on every transformations 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1518: [BEAM-1095] Add support set config for re...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1518


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (BEAM-1095) Add support set config for reuse-object on flink

2016-12-07 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed BEAM-1095.
--
   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Add support set config for reuse-object on flink
> 
>
> Key: BEAM-1095
> URL: https://issues.apache.org/jira/browse/BEAM-1095
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 0.4.0-incubating
>Reporter: Alexey Diomin
>Assignee: Alexey Diomin
>Priority: Trivial
> Fix For: 0.4.0-incubating
>
>
> Object-reuse is dangerous setting and disabled by default,
> but sometime we need use this option to omit performance overhead for 
> serialization-deserialization objects on every transformations 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #1518

2016-12-07 Thread aljoscha
This closes #1518


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c53e0b16
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c53e0b16
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c53e0b16

Branch: refs/heads/master
Commit: c53e0b1623e3ee3c08c329e2716440f031681591
Parents: 3b2e029 1b12520
Author: Aljoscha Krettek 
Authored: Thu Dec 8 09:44:07 2016 +0800
Committer: Aljoscha Krettek 
Committed: Thu Dec 8 09:44:07 2016 +0800

--
 .../flink/FlinkPipelineExecutionEnvironment.java| 12 
 .../apache/beam/runners/flink/FlinkPipelineOptions.java |  5 +
 2 files changed, 17 insertions(+)
--




[jira] [Commented] (BEAM-1095) Add support set config for reuse-object on flink

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730767#comment-15730767
 ] 

ASF GitHub Bot commented on BEAM-1095:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1518


> Add support set config for reuse-object on flink
> 
>
> Key: BEAM-1095
> URL: https://issues.apache.org/jira/browse/BEAM-1095
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 0.4.0-incubating
>Reporter: Alexey Diomin
>Priority: Trivial
>
> Object-reuse is dangerous setting and disabled by default,
> but sometime we need use this option to omit performance overhead for 
> serialization-deserialization objects on every transformations 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: [BEAM-1095] Add support set config for reuse-object on flink

2016-12-07 Thread aljoscha
Repository: incubator-beam
Updated Branches:
  refs/heads/master 3b2e0290d -> c53e0b162


[BEAM-1095] Add support set config for reuse-object on flink


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1b125207
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1b125207
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1b125207

Branch: refs/heads/master
Commit: 1b1252074dd6b57f4fb88ceb82c704d3d3d8147f
Parents: 3b2e029
Author: Alexey Diomin 
Authored: Wed Dec 7 09:39:27 2016 +0400
Committer: Aljoscha Krettek 
Committed: Thu Dec 8 09:44:00 2016 +0800

--
 .../flink/FlinkPipelineExecutionEnvironment.java| 12 
 .../apache/beam/runners/flink/FlinkPipelineOptions.java |  5 +
 2 files changed, 17 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1b125207/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java
--
diff --git 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java
 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java
index 391c3f2..69dcd5e 100644
--- 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java
+++ 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java
@@ -159,6 +159,12 @@ public class FlinkPipelineExecutionEnvironment {
 // set parallelism in the options (required by some execution code)
 options.setParallelism(flinkBatchEnv.getParallelism());
 
+if (options.getObjectReuse()) {
+  flinkBatchEnv.getConfig().enableObjectReuse();
+} else {
+  flinkBatchEnv.getConfig().disableObjectReuse();
+}
+
 return flinkBatchEnv;
   }
 
@@ -197,6 +203,12 @@ public class FlinkPipelineExecutionEnvironment {
 // set parallelism in the options (required by some execution code)
 options.setParallelism(flinkStreamEnv.getParallelism());
 
+if (options.getObjectReuse()) {
+  flinkStreamEnv.getConfig().enableObjectReuse();
+} else {
+  flinkStreamEnv.getConfig().disableObjectReuse();
+}
+
 // default to event time
 flinkStreamEnv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1b125207/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
--
diff --git 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
index be99f29..3bb358e 100644
--- 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
+++ 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
@@ -83,6 +83,11 @@ public interface FlinkPipelineOptions
   Long getExecutionRetryDelay();
   void setExecutionRetryDelay(Long delay);
 
+  @Description("Sets the behavior of reusing objects.")
+  @Default.Boolean(false)
+  Boolean getObjectReuse();
+  void setObjectReuse(Boolean reuse);
+
   /**
* Sets a state backend to store Beam's state during computation.
* Note: Only applicable when executing in streaming mode.



[jira] [Commented] (BEAM-1108) Remove deprecated Dataflow Runner options and update documentation

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730723#comment-15730723
 ] 

ASF GitHub Bot commented on BEAM-1108:
--

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1546


> Remove deprecated Dataflow Runner options and update documentation
> --
>
> Key: BEAM-1108
> URL: https://issues.apache.org/jira/browse/BEAM-1108
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>Priority: Minor
> Fix For: Not applicable
>
>
> Umbrella bug for removing deprecated {{DataflowPipelineXOptions}} 
> configurations, plus improving documentation. Will update bug description as 
> more tasks arise.
> 1. Remove the {{TEARDOWN_POLICY}} option.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1546: [BEAM-1108] DataflowRunner: remove deprec...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1546


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Closes #1546

2016-12-07 Thread dhalperi
Closes #1546


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3b2e0290
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3b2e0290
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3b2e0290

Branch: refs/heads/master
Commit: 3b2e0290ddbedd199926a70f36c02ea4515841cb
Parents: b44a7ac 6439f70
Author: Dan Halperin 
Authored: Wed Dec 7 17:18:11 2016 -0800
Committer: Dan Halperin 
Committed: Wed Dec 7 17:18:11 2016 -0800

--
 .../dataflow/DataflowPipelineTranslator.java|  4 --
 .../DataflowPipelineWorkerPoolOptions.java  | 45 
 2 files changed, 49 deletions(-)
--




[1/2] incubator-beam git commit: [BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control

2016-12-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master b44a7ac4a -> 3b2e0290d


[BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6439f701
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6439f701
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6439f701

Branch: refs/heads/master
Commit: 6439f701d1008d6a0432828e11e0fcc8a4fe6ecc
Parents: b44a7ac
Author: Dan Halperin 
Authored: Thu Dec 8 07:40:58 2016 +0800
Committer: Dan Halperin 
Committed: Thu Dec 8 07:43:13 2016 +0800

--
 .../dataflow/DataflowPipelineTranslator.java|  4 --
 .../DataflowPipelineWorkerPoolOptions.java  | 45 
 2 files changed, 49 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6439f701/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
index 8783056..8048df9 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
@@ -424,10 +424,6 @@ public class DataflowPipelineTranslator {
 
   WorkerPool workerPool = new WorkerPool();
 
-  if (options.getTeardownPolicy() != null) {
-
workerPool.setTeardownPolicy(options.getTeardownPolicy().getTeardownPolicyName());
-  }
-
   if (options.isStreaming()) {
 job.setType("JOB_TYPE_STREAMING");
   } else {

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6439f701/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java
index ffb5a3a..157321a 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java
@@ -191,51 +191,6 @@ public interface DataflowPipelineWorkerPoolOptions extends 
PipelineOptions {
   void setWorkerMachineType(String value);
 
   /**
-   * The policy for tearing down the workers spun up by the service.
-   *
-   * @deprecated Dataflow Service will only support TEARDOWN_ALWAYS policy in 
the future.
-   */
-  @Deprecated
-  enum TeardownPolicy {
-/**
- * All VMs created for a Dataflow job are deleted when the job finishes, 
regardless of whether
- * it fails or succeeds.
- */
-TEARDOWN_ALWAYS("TEARDOWN_ALWAYS"),
-/**
- * All VMs created for a Dataflow job are left running when the job 
finishes, regardless of
- * whether it fails or succeeds.
- */
-TEARDOWN_NEVER("TEARDOWN_NEVER"),
-/**
- * All VMs created for a Dataflow job are deleted when the job succeeds, 
but are left running
- * when it fails. (This is typically used for debugging failing jobs by 
SSHing into the
- * workers.)
- */
-TEARDOWN_ON_SUCCESS("TEARDOWN_ON_SUCCESS");
-
-private final String teardownPolicy;
-
-TeardownPolicy(String teardownPolicy) {
-  this.teardownPolicy = teardownPolicy;
-}
-
-public String getTeardownPolicyName() {
-  return this.teardownPolicy;
-}
-  }
-
-  /**
-   * The teardown policy for the VMs.
-   *
-   * If unset, the Dataflow service will choose a reasonable default.
-   */
-  @Description("The teardown policy for the VMs. If unset, the Dataflow 
service will "
-  + "choose a reasonable default.")
-  TeardownPolicy getTeardownPolicy();
-  void setTeardownPolicy(TeardownPolicy value);
-
-  /**
* List of local files to make available to workers.
*
* Files are placed on the worker's classpath.



[GitHub] incubator-beam pull request #1549: Fix handling of null ValueProviders in Di...

2016-12-07 Thread sammcveety
GitHub user sammcveety opened a pull request:

https://github.com/apache/incubator-beam/pull/1549

Fix handling of null ValueProviders in DisplayData

R: @swegner 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sammcveety/incubator-beam sgmc/dd_null

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1549.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1549


commit 6542b8f77e311be75706aab2f17c78a0aef4d0ef
Author: Sam McVeety 
Date:   2016-12-07T23:31:52Z

Fix handling of null ValueProviders in DisplayData




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #1548: Fix template_runner_test on Windows

2016-12-07 Thread charlesccychen
GitHub user charlesccychen opened a pull request:

https://github.com/apache/incubator-beam/pull/1548

Fix template_runner_test on Windows



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/charlesccychen/incubator-beam 
fix-template-runner-windows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1548.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1548


commit a565ca1008309564dba41d551c68ea553cd83a7b
Author: Charles Chen 
Date:   2016-12-08T00:09:49Z

Fix template_runner_test on Windows




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1107) Display user names for steps in the Flink Web UI

2016-12-07 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730375#comment-15730375
 ] 

Aljoscha Krettek commented on BEAM-1107:


Yep, you're right but even in the black text the operation names (MapPartition, 
GroupCombine and so on) are hardcoded in Flink right now so we cannot change 
that coming from Beam-on-Flink. Changing that would require changes to Flink 
(which I'm not opposed to).

> Display user names for steps in the Flink Web UI
> 
>
> Key: BEAM-1107
> URL: https://issues.apache.org/jira/browse/BEAM-1107
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Daniel Halperin
>Assignee: Aljoscha Krettek
> Attachments: screenshot-1.png
>
>
> [copying in-person / email discussion at Strata Singapore to JIRA]
> The FlinkBatchTransformTranslators use transform.getName() [1] -- this is the 
> "SDK name" for the transform.
> The "user name" for the transform is not available here, it is in fact on the 
> TransformHierarchy.Node as node.getFullName() [2].
> getFullName() is used some in Flink, but not when setting step names.
> I drafted a quick commit that sort of propagates the user names to the web UI 
> (but only for DataSource, and still too verbose: 
> https://github.com/dhalperi/incubator-beam/commit/a2f1fb06b22a85ec738e4f2a604c9a129891916c)
> Before this change, the "ReadLines" step showed up as: "DataSource (at 
> Read(CompressedSource) 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> With this change, it shows up as "DataSource (at ReadLines/Read 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> which I think is closer. [I'd still like it to JUST be "ReadLines/Read" e.g.].
> Thoughts?
> [1] 
> https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkBatchTransformTranslators.java#L129
> [2] 
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/TransformHierarchy.java#L252



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-1108) Remove deprecated Dataflow Runner options and update documentation

2016-12-07 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730373#comment-15730373
 ] 

Neelesh Srinivas Salian commented on BEAM-1108:
---

[~dhalp...@google.com] subtasks would be good to have to spread the work if 
there are multiple parts to this..

> Remove deprecated Dataflow Runner options and update documentation
> --
>
> Key: BEAM-1108
> URL: https://issues.apache.org/jira/browse/BEAM-1108
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>Priority: Minor
> Fix For: Not applicable
>
>
> Umbrella bug for removing deprecated {{DataflowPipelineXOptions}} 
> configurations, plus improving documentation. Will update bug description as 
> more tasks arise.
> 1. Remove the {{TEARDOWN_POLICY}} option.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1547: [BEAM-646] Add PTransformOverrideFactory ...

2016-12-07 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/1547

[BEAM-646] Add PTransformOverrideFactory to the Core SDK

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This migrates PTransformOverrideFactory from the DirectRunner to the
Core SDK, as part of BEAM-646.

Migrate all DirectRunner Override Factories to the new
PTransformOverrideFactory.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam override_factory_in_core

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1547.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1547


commit ad7aa03e12694bceb29906d2bb9df1ce009a1df2
Author: Thomas Groh 
Date:   2016-12-06T00:01:57Z

Add PTransformOverrideFactory to the Core SDK

This migrates PTransformOverrideFactory from the DirectRunner to the
Core SDK, as part of BEAM-646.

Add getOriginalToReplacements to provide a mapping from the original
outputs to replaced outputs. This enables all replaced nodes to be
rewired to output the original output.

Migrate all DirectRunner Override Factories to the new
PTransformOverrideFactory.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-646) Get runners out of the apply()

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730359#comment-15730359
 ] 

ASF GitHub Bot commented on BEAM-646:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/1547

[BEAM-646] Add PTransformOverrideFactory to the Core SDK

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This migrates PTransformOverrideFactory from the DirectRunner to the
Core SDK, as part of BEAM-646.

Migrate all DirectRunner Override Factories to the new
PTransformOverrideFactory.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam override_factory_in_core

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1547.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1547


commit ad7aa03e12694bceb29906d2bb9df1ce009a1df2
Author: Thomas Groh 
Date:   2016-12-06T00:01:57Z

Add PTransformOverrideFactory to the Core SDK

This migrates PTransformOverrideFactory from the DirectRunner to the
Core SDK, as part of BEAM-646.

Add getOriginalToReplacements to provide a mapping from the original
outputs to replaced outputs. This enables all replaced nodes to be
rewired to output the original output.

Migrate all DirectRunner Override Factories to the new
PTransformOverrideFactory.




> Get runners out of the apply()
> --
>
> Key: BEAM-646
> URL: https://issues.apache.org/jira/browse/BEAM-646
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>
> Right now, the runner intercepts calls to apply() and replaces transforms as 
> we go. This means that there is no "original" user graph. For portability and 
> misc architectural benefits, we would like to build the original graph first, 
> and have the runner override later.
> Some runners already work in this manner, but we could integrate it more 
> smoothly, with more validation, via some handy APIs on e.g. the Pipeline 
> object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1546: [BEAM-1108] DataflowRunner: remove deprec...

2016-12-07 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/1546

[BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control

R: @davorbonaci 
CC: @pjesa

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam teardown-policy

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1546.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1546


commit 6439f701d1008d6a0432828e11e0fcc8a4fe6ecc
Author: Dan Halperin 
Date:   2016-12-07T23:40:58Z

[BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1107) Display user names for steps in the Flink Web UI

2016-12-07 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730313#comment-15730313
 ] 

Daniel Halperin commented on BEAM-1107:
---

Ack -- I guess I have this intuition there's opportunity for more cleanup, but 
I may be wrong (or it may be a Flink-general, not Beam-on-Flink issue).

E.g., look at the attached screenshot:

* The name (grey) at the top is MapPartition -> Map -> GroupCombine -> Map
* The name of the steps (black) includes the identical as the grey, with 
additionally (step name)
* The Operation: text (small, grey) at the bottom includes the same (almost - 
logical vs physical?) information, although there appears to be some HTML error 
with inserting a break tag inside another break tag.


> Display user names for steps in the Flink Web UI
> 
>
> Key: BEAM-1107
> URL: https://issues.apache.org/jira/browse/BEAM-1107
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Daniel Halperin
>Assignee: Aljoscha Krettek
> Attachments: screenshot-1.png
>
>
> [copying in-person / email discussion at Strata Singapore to JIRA]
> The FlinkBatchTransformTranslators use transform.getName() [1] -- this is the 
> "SDK name" for the transform.
> The "user name" for the transform is not available here, it is in fact on the 
> TransformHierarchy.Node as node.getFullName() [2].
> getFullName() is used some in Flink, but not when setting step names.
> I drafted a quick commit that sort of propagates the user names to the web UI 
> (but only for DataSource, and still too verbose: 
> https://github.com/dhalperi/incubator-beam/commit/a2f1fb06b22a85ec738e4f2a604c9a129891916c)
> Before this change, the "ReadLines" step showed up as: "DataSource (at 
> Read(CompressedSource) 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> With this change, it shows up as "DataSource (at ReadLines/Read 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> which I think is closer. [I'd still like it to JUST be "ReadLines/Read" e.g.].
> Thoughts?
> [1] 
> https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkBatchTransformTranslators.java#L129
> [2] 
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/TransformHierarchy.java#L252



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-1107) Display user names for steps in the Flink Web UI

2016-12-07 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-1107:
--
Attachment: screenshot-1.png

> Display user names for steps in the Flink Web UI
> 
>
> Key: BEAM-1107
> URL: https://issues.apache.org/jira/browse/BEAM-1107
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Daniel Halperin
>Assignee: Aljoscha Krettek
> Attachments: screenshot-1.png
>
>
> [copying in-person / email discussion at Strata Singapore to JIRA]
> The FlinkBatchTransformTranslators use transform.getName() [1] -- this is the 
> "SDK name" for the transform.
> The "user name" for the transform is not available here, it is in fact on the 
> TransformHierarchy.Node as node.getFullName() [2].
> getFullName() is used some in Flink, but not when setting step names.
> I drafted a quick commit that sort of propagates the user names to the web UI 
> (but only for DataSource, and still too verbose: 
> https://github.com/dhalperi/incubator-beam/commit/a2f1fb06b22a85ec738e4f2a604c9a129891916c)
> Before this change, the "ReadLines" step showed up as: "DataSource (at 
> Read(CompressedSource) 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> With this change, it shows up as "DataSource (at ReadLines/Read 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> which I think is closer. [I'd still like it to JUST be "ReadLines/Read" e.g.].
> Thoughts?
> [1] 
> https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkBatchTransformTranslators.java#L129
> [2] 
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/TransformHierarchy.java#L252



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1545: [BEAM-551] Fix handling of TextIO.Sink

2016-12-07 Thread sammcveety
GitHub user sammcveety opened a pull request:

https://github.com/apache/incubator-beam/pull/1545

[BEAM-551] Fix handling of TextIO.Sink

R: @dhalperi 

Directory needs to be parameterized.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sammcveety/incubator-beam sgmc/text_io_write

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1545.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1545


commit d017fde2d063765a73b290e1b1e1b849f147910f
Author: Sam McVeety 
Date:   2016-12-07T21:27:53Z

[BEAM-551] Fix handling of TextIO.Sink

commit 9309c9389d9e9fa2cae3f7378692d0484ddc54b2
Author: Sam McVeety 
Date:   2016-12-07T22:09:41Z

Fix test




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-551) Support Dynamic PipelineOptions

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730289#comment-15730289
 ] 

ASF GitHub Bot commented on BEAM-551:
-

GitHub user sammcveety opened a pull request:

https://github.com/apache/incubator-beam/pull/1545

[BEAM-551] Fix handling of TextIO.Sink

R: @dhalperi 

Directory needs to be parameterized.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sammcveety/incubator-beam sgmc/text_io_write

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1545.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1545


commit d017fde2d063765a73b290e1b1e1b849f147910f
Author: Sam McVeety 
Date:   2016-12-07T21:27:53Z

[BEAM-551] Fix handling of TextIO.Sink

commit 9309c9389d9e9fa2cae3f7378692d0484ddc54b2
Author: Sam McVeety 
Date:   2016-12-07T22:09:41Z

Fix test




> Support Dynamic PipelineOptions
> ---
>
> Key: BEAM-551
> URL: https://issues.apache.org/jira/browse/BEAM-551
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Sam McVeety
>Assignee: Sam McVeety
>Priority: Minor
>
> During the graph construction phase, the given SDK generates an initial
> execution graph for the program.  At execution time, this graph is
> executed, either locally or by a service.  Currently, Beam only supports
> parameterization at graph construction time.  Both Flink and Spark supply
> functionality that allows a pre-compiled job to be run without SDK
> interaction with updated runtime parameters.
> In its current incarnation, Dataflow can read values of PipelineOptions at
> job submission time, but this requires the presence of an SDK to properly
> encode these values into the job.  We would like to build a common layer
> into the Beam model so that these dynamic options can be properly provided
> to jobs.
> Please see
> https://docs.google.com/document/d/1I-iIgWDYasb7ZmXbGBHdok_IK1r1YAJ90JG5Fz0_28o/edit
> for the high-level model, and
> https://docs.google.com/document/d/17I7HeNQmiIfOJi0aI70tgGMMkOSgGi8ZUH-MOnFatZ8/edit
> for
> the specific API proposal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1544: Handle empty batches in GcsIO batch metho...

2016-12-07 Thread charlesccychen
GitHub user charlesccychen opened a pull request:

https://github.com/apache/incubator-beam/pull/1544

Handle empty batches in GcsIO batch methods



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/charlesccychen/incubator-beam 
gcsio-empty-batches

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1544.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1544


commit ba6b0a2295a5c43fed8673960a7ef28964964ecf
Author: Charles Chen 
Date:   2016-12-07T23:03:01Z

Handle empty batches in GcsIO batch methods




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1107) Display user names for steps in the Flink Web UI

2016-12-07 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730212#comment-15730212
 ] 

Daniel Halperin commented on BEAM-1107:
---

Also copying [~aljoscha]'s response :)

{quote}
I think we can get it down to "Data Source (ReadLines/Read)" (and similarly for 
other operators). The problem is that the String parameter is not the correct 
way to set the name of the operator but some other (admittedly weird) thing 
called "location name". To set the name we have to call .name(String) on the 
created operator after creating it.
{quote}

> Display user names for steps in the Flink Web UI
> 
>
> Key: BEAM-1107
> URL: https://issues.apache.org/jira/browse/BEAM-1107
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Daniel Halperin
>Assignee: Aljoscha Krettek
>
> [copying in-person / email discussion at Strata Singapore to JIRA]
> The FlinkBatchTransformTranslators use transform.getName() [1] -- this is the 
> "SDK name" for the transform.
> The "user name" for the transform is not available here, it is in fact on the 
> TransformHierarchy.Node as node.getFullName() [2].
> getFullName() is used some in Flink, but not when setting step names.
> I drafted a quick commit that sort of propagates the user names to the web UI 
> (but only for DataSource, and still too verbose: 
> https://github.com/dhalperi/incubator-beam/commit/a2f1fb06b22a85ec738e4f2a604c9a129891916c)
> Before this change, the "ReadLines" step showed up as: "DataSource (at 
> Read(CompressedSource) 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> With this change, it shows up as "DataSource (at ReadLines/Read 
> (org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"
> which I think is closer. [I'd still like it to JUST be "ReadLines/Read" e.g.].
> Thoughts?
> [1] 
> https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkBatchTransformTranslators.java#L129
> [2] 
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/TransformHierarchy.java#L252



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-1107) Display user names for steps in the Flink Web UI

2016-12-07 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-1107:
-

 Summary: Display user names for steps in the Flink Web UI
 Key: BEAM-1107
 URL: https://issues.apache.org/jira/browse/BEAM-1107
 Project: Beam
  Issue Type: Improvement
  Components: runner-flink
Reporter: Daniel Halperin
Assignee: Aljoscha Krettek


[copying in-person / email discussion at Strata Singapore to JIRA]


The FlinkBatchTransformTranslators use transform.getName() [1] -- this is the 
"SDK name" for the transform.

The "user name" for the transform is not available here, it is in fact on the 
TransformHierarchy.Node as node.getFullName() [2].

getFullName() is used some in Flink, but not when setting step names.

I drafted a quick commit that sort of propagates the user names to the web UI 
(but only for DataSource, and still too verbose: 
https://github.com/dhalperi/incubator-beam/commit/a2f1fb06b22a85ec738e4f2a604c9a129891916c)

Before this change, the "ReadLines" step showed up as: "DataSource (at 
Read(CompressedSource) 
(org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"

With this change, it shows up as "DataSource (at ReadLines/Read 
(org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat))"

which I think is closer. [I'd still like it to JUST be "ReadLines/Read" e.g.].

Thoughts?


[1] 
https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkBatchTransformTranslators.java#L129
[2] 
https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/TransformHierarchy.java#L252



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-905) Archetype pom needs to generalize dependencies

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730204#comment-15730204
 ] 

ASF GitHub Bot commented on BEAM-905:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1533


> Archetype pom needs to generalize dependencies
> --
>
> Key: BEAM-905
> URL: https://issues.apache.org/jira/browse/BEAM-905
> Project: Beam
>  Issue Type: Bug
>Affects Versions: 0.4.0-incubating
> Environment: Currently the archetype pom includes the direct runner 
> and the dataflow one, but not the others. It should do the same magic as the 
> main examples.
>Reporter: Frances Perry
>Assignee: Daniel Halperin
> Fix For: Not applicable
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1533: [BEAM-905] Add shading config to examples...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1533


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-975) Issue with MongoDBIO

2016-12-07 Thread Reza Nouri (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730202#comment-15730202
 ] 

Reza Nouri commented on BEAM-975:
-

Hey [~jbonofre],

Here is from mongo log before failure:

2016-12-08T09:42:36.346+1100 I NETWORK  [initandlisten] Listener: accept() 
returns -1 errno:24 Too many open files
2016-12-08T09:42:36.346+1100 E NETWORK  [initandlisten] Out of file 
descriptors. Waiting one second before trying to accept more connections.
2016-12-08T09:42:37.743+1100 E STORAGE  [thread1] WiredTiger (24) 
[1481150557:743579][1523:0x75b05000], log-server: data/db/journal: 
directory-list: opendir: Too many open files
2016-12-08T09:42:37.743+1100 E STORAGE  [thread1] WiredTiger (24) 
[1481150557:743728][1523:0x75b05000], log-server: journal: directory-list, 
prefix "WiredTigerPreplog": Too many open files
2016-12-08T09:42:37.743+1100 E STORAGE  [thread1] WiredTiger (24) 
[1481150557:743750][1523:0x75b05000], log-server: log pre-alloc server 
error: Too many open files
2016-12-08T09:42:37.743+1100 E STORAGE  [thread1] WiredTiger (24) 
[1481150557:743768][1523:0x75b05000], log-server: log server error: Too 
many open files
2016-12-08T09:42:47.005+1100 W FTDC [ftdc] Uncaught exception in 
'FileNotOpen: Failed to open interim file 
data/db/diagnostic.data/metrics.interim.temp' in full-time diagnostic data 
capture subsystem. Shutting down the full-time diagnostic data capture 
subsystem.
2016-12-08T09:43:27.758+1100 I NETWORK  [initandlisten] Listener: accept() 
returns -1 errno:24 Too many open files
2016-12-08T09:43:27.758+1100 E NETWORK  [initandlisten] Out of file 
descriptors. Waiting one second before trying to accept more connections.
2016-12-08T09:43:28.635+1100 W NETWORK  [HostnameCanonicalizationWorker] Failed 
to obtain address information for hostname dyn: nodename nor servname provided, 
or not known
2016-12-08T09:43:28.759+1100 I NETWORK  [initandlisten] Listener: accept() 
returns -1 errno:24 Too many open files
2016-12-08T09:43:28.759+1100 E NETWORK  [initandlisten] Out of file 
descriptors. Waiting one second before trying to accept more connections.
2016-12-08T09:43:29.021+1100 E STORAGE  [thread2] WiredTiger (24) 
[1481150609:20956][1523:0x75c8e000], file:WiredTiger.wt, 
WT_SESSION.checkpoint: data/db/WiredTiger.turtle: handle-open: open: Too many 
open files
2016-12-08T09:43:29.021+1100 E STORAGE  [thread2] WiredTiger (24) 
[1481150609:21326][1523:0x75c8e000], checkpoint-server: checkpoint server 
error: Too many open files
2016-12-08T09:43:29.021+1100 E STORAGE  [thread2] WiredTiger (-31804) 
[1481150609:21355][1523:0x75c8e000], checkpoint-server: the process must 
exit and restart: WT_PANIC: WiredTiger library panic
2016-12-08T09:43:29.021+1100 I -[thread2] Fatal Assertion 28558
2016-12-08T09:43:29.021+1100 I -[thread2] 

***aborting after fassert() failure


2016-12-08T09:43:29.029+1100 F -[thread2] Got signal: 6 (Abort trap: 6).

And then it throws connection timeout exception:

SEVERE: Servlet.service() for servlet [Curation] in context with path [] threw 
exception [org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=127.0.0.1:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.ConnectException: Connection refused}}]] with root cause
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=127.0.0.1:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.ConnectException: Connection refused}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:89)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
at 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
at com.mongodb.Mongo.execute(Mongo.java:772)
at 

[1/2] incubator-beam git commit: [BEAM-905] Add shading config to examples archetype and enable it for Flink

2016-12-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 5b31a3699 -> b44a7ac4a


[BEAM-905] Add shading config to examples archetype and enable it for Flink

This makes the Flink quickstart work out of the box.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/43fef277
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/43fef277
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/43fef277

Branch: refs/heads/master
Commit: 43fef2775145f67def3ab8a246ecca192a7d650b
Parents: 5b31a36
Author: Dan Halperin 
Authored: Wed Dec 7 20:06:57 2016 +0800
Committer: Dan Halperin 
Committed: Wed Dec 7 14:55:02 2016 -0800

--
 .../main/resources/archetype-resources/pom.xml  | 40 
 1 file changed, 40 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/43fef277/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
--
diff --git 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
index df2e9f3..95d163c 100644
--- 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
+++ 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
@@ -85,6 +85,38 @@
 false
   
 
+
+
+
+  org.apache.maven.plugins
+  maven-shade-plugin
+  2.4.1
+  
+
+  package
+  
+shade
+  
+  
+
${project.artifactId}-bundled-${project.version}
+
+  
+*:*
+
+  META-INF/LICENSE
+  META-INF/*.SF
+  META-INF/*.DSA
+  META-INF/*.RSA
+
+  
+
+  
+
+  
+
   
 
   
@@ -140,6 +172,14 @@
   runtime
 
   
+  
+
+  
+org.apache.maven.plugins
+maven-shade-plugin
+  
+
+  
 
 
 



[jira] [Commented] (BEAM-25) Add user-ready API for interacting with state

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730142#comment-15730142
 ] 

ASF GitHub Bot commented on BEAM-25:


GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1543

[BEAM-25] Move CopyOnAccessStateInternals to runners/direct

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @tgroh this is only actually used by the direct runner. Not necessarily 
the greatest JIRA for this, but I'm not sure of a blanket for cleaning out the 
SDK's excessive surface area, so I went with the state ticket.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam 
CopyOnAccessStateInternals

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1543.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1543


commit 019612ba5e5a656e848d458617007d39be42b3e9
Author: Kenneth Knowles 
Date:   2016-12-07T22:28:39Z

Move CopyOnAccessStateInternals to runners/direct




> Add user-ready API for interacting with state
> -
>
> Key: BEAM-25
> URL: https://issues.apache.org/jira/browse/BEAM-25
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: State
>
> Our current state API is targeted at runner implementers, not pipeline 
> authors. As such it has many capabilities that are not necessary nor 
> desirable for simple use cases of stateful ParDo (such as dynamic state tag 
> creation). Implement a simple state intended for user access.
> (Details of our current thoughts in forthcoming design doc)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1543: [BEAM-25] Move CopyOnAccessStateInternals...

2016-12-07 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1543

[BEAM-25] Move CopyOnAccessStateInternals to runners/direct

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @tgroh this is only actually used by the direct runner. Not necessarily 
the greatest JIRA for this, but I'm not sure of a blanket for cleaning out the 
SDK's excessive surface area, so I went with the state ticket.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam 
CopyOnAccessStateInternals

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1543.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1543


commit 019612ba5e5a656e848d458617007d39be42b3e9
Author: Kenneth Knowles 
Date:   2016-12-07T22:28:39Z

Move CopyOnAccessStateInternals to runners/direct




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #1542: Fix a typo in query split error handling

2016-12-07 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/1542

Fix a typo in query split error handling

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
- Also add unit test to catch this bug. 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam py_ds_split_query_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1542.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1542


commit 72b8d6829f420bacf09c3d0340e578be36c50f13
Author: Vikas Kedigehalli 
Date:   2016-12-07T22:19:26Z

Fix a typo in query split error handling




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730125#comment-15730125
 ] 

ASF GitHub Bot commented on BEAM-498:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1541

[BEAM-498] Move PerKeyCombineFnRunner to runners-core

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @peihe seems there is no reason this needs to stay, right? 
`PerKeyCombineFnRunners` (the static util class) is already moved.

R: @lukecwik this has some references to `OldDoFn` so at least this gets 
them out of the SDK.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam 
PerKeyCombineFnRunner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1541.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1541


commit 351f58567212e7a9d4664ca65a0bd4a10a77ed81
Author: Kenneth Knowles 
Date:   2016-12-07T22:22:21Z

Move PerKeyCombineFnRunner to runners-core




> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1541: [BEAM-498] Move PerKeyCombineFnRunner to ...

2016-12-07 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1541

[BEAM-498] Move PerKeyCombineFnRunner to runners-core

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @peihe seems there is no reason this needs to stay, right? 
`PerKeyCombineFnRunners` (the static util class) is already moved.

R: @lukecwik this has some references to `OldDoFn` so at least this gets 
them out of the SDK.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam 
PerKeyCombineFnRunner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1541.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1541


commit 351f58567212e7a9d4664ca65a0bd4a10a77ed81
Author: Kenneth Knowles 
Date:   2016-12-07T22:22:21Z

Move PerKeyCombineFnRunner to runners-core




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #1540: Add more documentation to datastore_wordc...

2016-12-07 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/1540

Add more documentation to datastore_wordcount example

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam py_ds_doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1540.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1540


commit 93afecb166cf4f1515007db5d803a6945fcb668e
Author: Vikas Kedigehalli 
Date:   2016-12-07T22:14:41Z

Add more documentation to datastore_wordcount example




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730120#comment-15730120
 ] 

ASF GitHub Bot commented on BEAM-498:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1539

[BEAM-498] Remove misc occurrences of OldDoFn

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @lukecwik here's a few more

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam remove-OldDoFn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1539.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1539


commit 4c178c4dbfbdbcb5e5e3100a14807be7dfda62be
Author: Kenneth Knowles 
Date:   2016-12-07T22:17:01Z

Remove misc occurrences of OldDoFn




> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1539: [BEAM-498] Remove misc occurrences of Old...

2016-12-07 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1539

[BEAM-498] Remove misc occurrences of OldDoFn

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

R: @lukecwik here's a few more

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam remove-OldDoFn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1539.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1539


commit 4c178c4dbfbdbcb5e5e3100a14807be7dfda62be
Author: Kenneth Knowles 
Date:   2016-12-07T22:17:01Z

Remove misc occurrences of OldDoFn




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #1506: Improve BigQuery load error message

2016-12-07 Thread sammcveety
Github user sammcveety closed the pull request at:

https://github.com/apache/incubator-beam/pull/1506


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-507) Fill in the documentation/runners/spark portion of the website

2016-12-07 Thread Amit Sela (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730054#comment-15730054
 ] 

Amit Sela commented on BEAM-507:


I'll have a PR by tomorrow.

> Fill in the documentation/runners/spark portion of the website
> --
>
> Key: BEAM-507
> URL: https://issues.apache.org/jira/browse/BEAM-507
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Frances Perry
>Assignee: Amit Sela
>
> As per 
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
> Should be a landing page for Spark-specific information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (BEAM-438) Rename one of PTransform.apply and PInput.apply

2016-12-07 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-438:


Assignee: Kenneth Knowles

> Rename one of PTransform.apply and PInput.apply
> ---
>
> Key: BEAM-438
> URL: https://issues.apache.org/jira/browse/BEAM-438
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
>
> Before releasing Beam 1.0, we should do this.
> Right now, it's legal to call:
> {{ptransform.apply(input)}}
> and 
> {{input.apply(ptransform)}}
> when only the latter is correct. The former skips various validation steps 
> and loses the notion of composite transforms.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1538: [BEAM-438] Rename PTransform.apply to PTr...

2016-12-07 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1538

[BEAM-438] Rename PTransform.apply to PTransform.expand

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Opening this to have a code pointer in dev list discussion.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam PTransform-expand

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1538.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1538


commit ff50e4d470d954c17d73358aa3e0c4c8b4123b87
Author: Kenneth Knowles 
Date:   2016-12-07T21:33:04Z

Rename PTransform.apply to PTransform.expand




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-438) Rename one of PTransform.apply and PInput.apply

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729977#comment-15729977
 ] 

ASF GitHub Bot commented on BEAM-438:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1538

[BEAM-438] Rename PTransform.apply to PTransform.expand

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Opening this to have a code pointer in dev list discussion.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam PTransform-expand

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1538.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1538


commit ff50e4d470d954c17d73358aa3e0c4c8b4123b87
Author: Kenneth Knowles 
Date:   2016-12-07T21:33:04Z

Rename PTransform.apply to PTransform.expand




> Rename one of PTransform.apply and PInput.apply
> ---
>
> Key: BEAM-438
> URL: https://issues.apache.org/jira/browse/BEAM-438
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>  Labels: backward-incompatible
>
> Before releasing Beam 1.0, we should do this.
> Right now, it's legal to call:
> {{ptransform.apply(input)}}
> and 
> {{input.apply(ptransform)}}
> when only the latter is correct. The former skips various validation steps 
> and loses the notion of composite transforms.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-1090) High memory usage error

2016-12-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-1090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729950#comment-15729950
 ] 

María GH commented on BEAM-1090:


I had another occurrence: 
https://travis-ci.org/apache/incubator-beam/jobs/181976745

> High memory usage error
> ---
>
> Key: BEAM-1090
> URL: https://issues.apache.org/jira/browse/BEAM-1090
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 0.3.0-incubating
>Reporter: María GH
>Priority: Minor
>
> Non-reproducible high memory usage test failure. It goes away on its own.
> RuntimeError: High memory usage: 201418866688 > 201008464768 [while running 
> 'oom:check']
> root: WARNING: A task failed with exception.
>  High memory usage: 201418866688 > 201008464768 [while running 'oom:check']
> ---
> Complete results at https://travis-ci.org/apache/incubator-beam/jobs/181011669



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: beam_PostCommit_Python_Verify #839

2016-12-07 Thread Apache Jenkins Server
See 

--
[...truncated 3006 lines...]
==
ERROR: test_par_do_with_multiple_outputs_and_using_yield 
(apache_beam.dataflow_test.DataflowTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 147, in test_par_do_with_multiple_outputs_and_using_yield
pipeline.run()
  File 
"
 line 159, in run
return self.runner.run(self)
  File 
"
 line 175, in run
pipeline.options, job_version)
  File 
"
 line 360, in __init__
credentials = get_service_credentials()
  File 
"
 line 171, in get_service_credentials
credentials.get_access_token()
  File "build/bdist.linux-x86_64/egg/oauth2client/client.py", line 677, in 
get_access_token
self.refresh(http)
  File "build/bdist.linux-x86_64/egg/oauth2client/client.py", line 560, in 
refresh
self._refresh(http.request)
  File 
"
 line 109, in _refresh
['gcloud', 'auth', 'print-access-token'], stdout=processes.PIPE)
  File 
"
 line 52, in Popen
return subprocess.Popen(*args, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1308, in _execute_child
data = _eintr_retry_call(os.read, errpipe_read, 1048576)
  File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
return func(*args)
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_par_do_with_multiple_outputs_and_using_yield 
(apache_beam.dataflow_test.DataflowTest)'

==
ERROR: test_par_do_with_side_input_as_arg 
(apache_beam.dataflow_test.DataflowTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 94, in test_par_do_with_side_input_as_arg
pipeline.run()
  File 
"
 line 159, in run
return self.runner.run(self)
  File 

[jira] [Commented] (BEAM-597) Provide type information from Coders

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729854#comment-15729854
 ] 

ASF GitHub Bot commented on BEAM-597:
-

GitHub user jeremiele opened a pull request:

https://github.com/apache/incubator-beam/pull/1537

[BEAM-597] Added a new method on Coder which returns a TypeDescriptor.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

The new method allows returning type information about the data being
encoded and decoded by a Coder.
Added a default implementation to StandardCoder which returns the
TypeDescriptor for Object to ease the transition and avoid breaking
implementations relying on StandardCoder or AtomicCoder.

This will break classes implementing the Coder interface directly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jeremiele/incubator-beam 
add_method_to_coder_for_typedescriptor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1537.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1537


commit 4a892401689fd01a831ff8904e46f917f8f1
Author: Jeremie Lenfant-Engelmann 
Date:   2016-12-07T19:29:29Z

Added a new method on Coder which returns a TypeDescriptor.

The new method allows returning type information about the data being
encoded and decoded by a Coder.
Added a default implementation to StandardCoder which returns the
TypeDescriptor for Object to ease the transition and avoid breaking
implementations relying on StandardCoder or AtomicCoder.

This will break classes implementing the Coder interface directly.




> Provide type information from Coders
> 
>
> Key: BEAM-597
> URL: https://issues.apache.org/jira/browse/BEAM-597
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jeremie Lenfant-Engelmann
>Assignee: Jeremie Lenfant-Engelmann
>Priority: Minor
>
> The Coder interface should return a TypeDescriptor describing the type that 
> is currently encoded/decoded by the Coder.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1537: [BEAM-597] Added a new method on Coder wh...

2016-12-07 Thread jeremiele
GitHub user jeremiele opened a pull request:

https://github.com/apache/incubator-beam/pull/1537

[BEAM-597] Added a new method on Coder which returns a TypeDescriptor.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

The new method allows returning type information about the data being
encoded and decoded by a Coder.
Added a default implementation to StandardCoder which returns the
TypeDescriptor for Object to ease the transition and avoid breaking
implementations relying on StandardCoder or AtomicCoder.

This will break classes implementing the Coder interface directly.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jeremiele/incubator-beam 
add_method_to_coder_for_typedescriptor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1537.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1537


commit 4a892401689fd01a831ff8904e46f917f8f1
Author: Jeremie Lenfant-Engelmann 
Date:   2016-12-07T19:29:29Z

Added a new method on Coder which returns a TypeDescriptor.

The new method allows returning type information about the data being
encoded and decoded by a Coder.
Added a default implementation to StandardCoder which returns the
TypeDescriptor for Object to ease the transition and avoid breaking
implementations relying on StandardCoder or AtomicCoder.

This will break classes implementing the Coder interface directly.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: beam_PostCommit_Python_Verify #838

2016-12-07 Thread Apache Jenkins Server
See 

Changes:

[robertwb] [BEAM-1077] @ValidatesRunner Test in Python Postcommit

--
[...truncated 2958 lines...]
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_par_do_with_multiple_outputs_and_using_return 
(apache_beam.dataflow_test.DataflowTest)'

==
ERROR: test_par_do_with_multiple_outputs_and_using_yield 
(apache_beam.dataflow_test.DataflowTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 147, in test_par_do_with_multiple_outputs_and_using_yield
pipeline.run()
  File 
"
 line 159, in run
return self.runner.run(self)
  File 
"
 line 175, in run
pipeline.options, job_version)
  File 
"
 line 360, in __init__
credentials = get_service_credentials()
  File 
"
 line 171, in get_service_credentials
credentials.get_access_token()
  File "build/bdist.linux-x86_64/egg/oauth2client/client.py", line 677, in 
get_access_token
self.refresh(http)
  File "build/bdist.linux-x86_64/egg/oauth2client/client.py", line 560, in 
refresh
self._refresh(http.request)
  File 
"
 line 113, in _refresh
output, _ = gcloud_process.communicate()
  File "/usr/lib/python2.7/subprocess.py", line 791, in communicate
stdout = _eintr_retry_call(self.stdout.read)
  File "/usr/lib/python2.7/subprocess.py", line 476, in _eintr_retry_call
return func(*args)
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_par_do_with_multiple_outputs_and_using_yield 
(apache_beam.dataflow_test.DataflowTest)'

==
ERROR: test_par_do_with_side_input_as_arg 
(apache_beam.dataflow_test.DataflowTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 94, in test_par_do_with_side_input_as_arg
pipeline.run()
  File 
"
 line 159, in run
return 

[1/2] incubator-beam git commit: Closes #1492

2016-12-07 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk d5e8c79a3 -> 4a660c604


Closes #1492


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/4a660c60
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/4a660c60
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/4a660c60

Branch: refs/heads/python-sdk
Commit: 4a660c604a0b21f685d87b8ecc008aeb13bb4049
Parents: d5e8c79 c7626ad
Author: Robert Bradshaw 
Authored: Wed Dec 7 12:14:38 2016 -0800
Committer: Robert Bradshaw 
Committed: Wed Dec 7 12:14:38 2016 -0800

--
 sdks/python/run_postcommit.sh | 36 ++--
 1 file changed, 26 insertions(+), 10 deletions(-)
--




[jira] [Commented] (BEAM-884) Add Display Data to the Python SDK's PipelineOptions, Avro io and other transforms

2016-12-07 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729725#comment-15729725
 ] 

Pablo Estrada commented on BEAM-884:


This is done.

> Add Display Data to the Python SDK's PipelineOptions, Avro io and other 
> transforms
> --
>
> Key: BEAM-884
> URL: https://issues.apache.org/jira/browse/BEAM-884
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Pablo Estrada
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-1106) Remove no_pipeline_type_check flag from Python SDK

2016-12-07 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-1106:
---

 Summary: Remove no_pipeline_type_check flag from Python SDK
 Key: BEAM-1106
 URL: https://issues.apache.org/jira/browse/BEAM-1106
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Reporter: Pablo Estrada
Assignee: Frances Perry


It's already the default behavior. It should be possible to remove it without 
trouble.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-824) Misleading error message when sdk_location is missing in python

2016-12-07 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729724#comment-15729724
 ] 

Pablo Estrada commented on BEAM-824:


This was fixed.

> Misleading error message when sdk_location is missing in python
> ---
>
> Key: BEAM-824
> URL: https://issues.apache.org/jira/browse/BEAM-824
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Pablo Estrada
>
> When trying to submit jobs to the Cloud Dataflow service using the Python 
> SDK, the sdk_location should be provided or the serive errors out saying that 
> package google-cloud-dataflow is missing.
> We might want to prompt users to add sdk_location parameter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-1055) Display Data keys on Python are inconsistent

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729720#comment-15729720
 ] 

ASF GitHub Bot commented on BEAM-1055:
--

Github user pabloem closed the pull request at:

https://github.com/apache/incubator-beam/pull/1443


> Display Data keys on Python are inconsistent
> 
>
> Key: BEAM-1055
> URL: https://issues.apache.org/jira/browse/BEAM-1055
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Pablo Estrada
>
> Some are in camelCase, some are in snake_case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-722) Add Display Data to the Python SDK

2016-12-07 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729723#comment-15729723
 ] 

Pablo Estrada commented on BEAM-722:


This is done.

> Add Display Data to the Python SDK
> --
>
> Key: BEAM-722
> URL: https://issues.apache.org/jira/browse/BEAM-722
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Pablo Estrada
>Assignee: Frances Perry
>
> The DisplayData feature has been added to the Java SDK (see blog post 
> announcing it: 
> https://cloud.google.com/blog/big-data/2016/06/dataflow-updates-see-more-details-about-your-pipelines).
>  We need now to add it to the Python SDK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-1055) Display Data keys on Python are inconsistent

2016-12-07 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729721#comment-15729721
 ] 

Pablo Estrada commented on BEAM-1055:
-

This is done.

> Display Data keys on Python are inconsistent
> 
>
> Key: BEAM-1055
> URL: https://issues.apache.org/jira/browse/BEAM-1055
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Pablo Estrada
>
> Some are in camelCase, some are in snake_case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1443: [BEAM-1055] Display Data keys on Python a...

2016-12-07 Thread pabloem
Github user pabloem closed the pull request at:

https://github.com/apache/incubator-beam/pull/1443


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1105) Adding Beam's pico Wordcount to the existing examples.

2016-12-07 Thread Jesse Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729610#comment-15729610
 ] 

Jesse Anderson commented on BEAM-1105:
--

Sounds good to me.

> Adding Beam's pico Wordcount to the existing examples. 
> ---
>
> Key: BEAM-1105
> URL: https://issues.apache.org/jira/browse/BEAM-1105
> Project: Beam
>  Issue Type: Wish
>Reporter: Neelesh Srinivas Salian
>Assignee: Neelesh Srinivas Salian
>Priority: Minor
>  Labels: examples
>
> http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/
> Is a good explanation for the WordCount that would encourage users.
> Adding this to the examples and subsequently the docs is a good step to help 
> new users start from a good foundation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-1065) FileBasedSource: replace SeekableByteChannel with open(spec, startingPosition)

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729597#comment-15729597
 ] 

ASF GitHub Bot commented on BEAM-1065:
--

Github user peihe closed the pull request at:

https://github.com/apache/incubator-beam/pull/1426


> FileBasedSource: replace SeekableByteChannel with open(spec, startingPosition)
> --
>
> Key: BEAM-1065
> URL: https://issues.apache.org/jira/browse/BEAM-1065
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Pei He
>
> FileBasedReader should be able to open the file with the 
> Source.getStartOffset(), and then read forward to find the first input 
> element.
> The benefits are:
> 1. It is easier to implement a ReadableByteChannel.
> 2. Dynamically splitting won't require file systems to support seeking.
> 3. Doesn't need to seek to position twice, which is what current API does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1426: [BEAM-1065] FileBasedSource: replace Seek...

2016-12-07 Thread peihe
Github user peihe closed the pull request at:

https://github.com/apache/incubator-beam/pull/1426


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-664) Port Dataflow SDK WordCount walkthrough to Beam site

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729567#comment-15729567
 ] 

ASF GitHub Bot commented on BEAM-664:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1536

[BEAM-664] Revise WindowedWordCount example to be more independent of 
runner and execution style

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This removes the use of BigQuery from the WindowedWordCount example, 
replacing it with a somewhat hacky file-based write of the output, using the 
window as the idempotency key. In order to port the test and to benefit from 
recent improvements in `FileBasedCheckSumMatcher`, I've factored out the 
resiliency code from that into an internal-only minimal `ShardedFile` class 
with just enough API surface to write these tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam WindowedWordCount

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1536.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1536


commit bac7b192c9fefdf536e324f1fde73bac2cd903fa
Author: Kenneth Knowles 
Date:   2016-11-04T03:44:45Z

Add IntervalWindow coder to the standard registry

commit aa09449379dfaedbdfb0b73bae85ffd334b838b1
Author: Kenneth Knowles 
Date:   2016-11-03T21:37:26Z

Revise WindowedWordCount for runner and execution mode portability

commit 357732efb866e4a24d76a9aaafa10e6bc964fe7e
Author: Kenneth Knowles 
Date:   2016-11-08T06:06:00Z

Check the onSuccessMatcher in DirectRunner if isBlockOnRun is set

commit 76d19716829cb98c4f0d4b34244fde8866b6d6a9
Author: Kenneth Knowles 
Date:   2016-12-05T22:32:12Z

Factor out ShardedFile from FileChecksumMatcher




> Port Dataflow SDK WordCount walkthrough to Beam site
> 
>
> Key: BEAM-664
> URL: https://issues.apache.org/jira/browse/BEAM-664
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Hadar Hod
>Assignee: Hadar Hod
>
> Port the WordCount walkthrough from Dataflow docs to Beam website. 
> * Copy prose (translate from html to md, remove Dataflow references, etc)
> * Add accurate "How to Run" instructions for each of the WC examples
> * Include code snippets from real examples



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1536: [BEAM-664] Revise WindowedWordCount examp...

2016-12-07 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/1536

[BEAM-664] Revise WindowedWordCount example to be more independent of 
runner and execution style

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This removes the use of BigQuery from the WindowedWordCount example, 
replacing it with a somewhat hacky file-based write of the output, using the 
window as the idempotency key. In order to port the test and to benefit from 
recent improvements in `FileBasedCheckSumMatcher`, I've factored out the 
resiliency code from that into an internal-only minimal `ShardedFile` class 
with just enough API surface to write these tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam WindowedWordCount

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1536.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1536


commit bac7b192c9fefdf536e324f1fde73bac2cd903fa
Author: Kenneth Knowles 
Date:   2016-11-04T03:44:45Z

Add IntervalWindow coder to the standard registry

commit aa09449379dfaedbdfb0b73bae85ffd334b838b1
Author: Kenneth Knowles 
Date:   2016-11-03T21:37:26Z

Revise WindowedWordCount for runner and execution mode portability

commit 357732efb866e4a24d76a9aaafa10e6bc964fe7e
Author: Kenneth Knowles 
Date:   2016-11-08T06:06:00Z

Check the onSuccessMatcher in DirectRunner if isBlockOnRun is set

commit 76d19716829cb98c4f0d4b34244fde8866b6d6a9
Author: Kenneth Knowles 
Date:   2016-12-05T22:32:12Z

Factor out ShardedFile from FileChecksumMatcher




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #1535: Update com.google.auto.value.version to 1...

2016-12-07 Thread joshualitt
Github user joshualitt closed the pull request at:

https://github.com/apache/incubator-beam/pull/1535


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1105) Adding Beam's pico Wordcount to the existing examples.

2016-12-07 Thread Neelesh Srinivas Salian (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729486#comment-15729486
 ] 

Neelesh Srinivas Salian commented on BEAM-1105:
---

[~jbonofre] I agree. Makes sense to bundle them up together.
I'll start a thread and we can think it over. I'm thinking a user - specific 
examples bundle / module would be best to gather all of them.

Will see if others have ideas.

> Adding Beam's pico Wordcount to the existing examples. 
> ---
>
> Key: BEAM-1105
> URL: https://issues.apache.org/jira/browse/BEAM-1105
> Project: Beam
>  Issue Type: Wish
>Reporter: Neelesh Srinivas Salian
>Assignee: Neelesh Srinivas Salian
>Priority: Minor
>  Labels: examples
>
> http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/
> Is a good explanation for the WordCount that would encourage users.
> Adding this to the examples and subsequently the docs is a good step to help 
> new users start from a good foundation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-1105) Adding Beam's pico Wordcount to the existing examples.

2016-12-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729476#comment-15729476
 ] 

Jean-Baptiste Onofré commented on BEAM-1105:


FYI, I also have concrete samples here: https://github.com/jbonofre/beam-samples

I think we should have a discussion about that we provide.

Currently, the examples are also used to test the runners.

IMHO, Jesse's example is a sample, end-user oriented. So, it's a good candidate 
to be gather with beam-samples all together.

> Adding Beam's pico Wordcount to the existing examples. 
> ---
>
> Key: BEAM-1105
> URL: https://issues.apache.org/jira/browse/BEAM-1105
> Project: Beam
>  Issue Type: Wish
>Reporter: Neelesh Srinivas Salian
>Assignee: Neelesh Srinivas Salian
>Priority: Minor
>  Labels: examples
>
> http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/
> Is a good explanation for the WordCount that would encourage users.
> Adding this to the examples and subsequently the docs is a good step to help 
> new users start from a good foundation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-1105) Adding Beam's pico Wordcount to the existing examples.

2016-12-07 Thread Neelesh Srinivas Salian (JIRA)
Neelesh Srinivas Salian created BEAM-1105:
-

 Summary: Adding Beam's pico Wordcount to the existing examples. 
 Key: BEAM-1105
 URL: https://issues.apache.org/jira/browse/BEAM-1105
 Project: Beam
  Issue Type: Wish
Reporter: Neelesh Srinivas Salian
Assignee: Neelesh Srinivas Salian
Priority: Minor


http://www.jesse-anderson.com/2016/12/beams-pico-wordcount/
Is a good explanation for the WordCount that would encourage users.

Adding this to the examples and subsequently the docs is a good step to help 
new users start from a good foundation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1535: Update com.google.auto.value.version to 1...

2016-12-07 Thread joshualitt
GitHub user joshualitt opened a pull request:

https://github.com/apache/incubator-beam/pull/1535

Update com.google.auto.value.version to 1.4-rc1





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/joshualitt/incubator-beam update_pom

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1535.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1535


commit f183c21a6390f69836ac1c9b9111b432b19c0bbe
Author: Joshua Litt 
Date:   2016-12-07T18:13:56Z

Update com.google.auto.value.version to 1.4-rc1




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-646) Get runners out of the apply()

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729312#comment-15729312
 ] 

ASF GitHub Bot commented on BEAM-646:
-

Github user tgroh closed the pull request at:

https://github.com/apache/incubator-beam/pull/1442


> Get runners out of the apply()
> --
>
> Key: BEAM-646
> URL: https://issues.apache.org/jira/browse/BEAM-646
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>
> Right now, the runner intercepts calls to apply() and replaces transforms as 
> we go. This means that there is no "original" user graph. For portability and 
> misc architectural benefits, we would like to build the original graph first, 
> and have the runner override later.
> Some runners already work in this manner, but we could integrate it more 
> smoothly, with more validation, via some handy APIs on e.g. the Pipeline 
> object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1442: [BEAM-646] Add Replacement Methods to Tra...

2016-12-07 Thread tgroh
Github user tgroh closed the pull request at:

https://github.com/apache/incubator-beam/pull/1442


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-115) Beam Runner API

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729306#comment-15729306
 ] 

ASF GitHub Bot commented on BEAM-115:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1511


> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical 
> vision.
> It has technical limitations:
>  - The user's DAG (even including library expansions) is never explicitly 
> represented, so it cannot be analyzed except incrementally, and cannot 
> necessarily be reconstructed (for example, to display it!).
>  - The flattened DAG of just primitive transforms isn't well-suited for 
> display or transform override.
>  - The TransformHierarchy isn't well-suited for optimizations.
>  - The user must realistically pre-commit to a runner, and its configuration 
> (batch vs streaming) prior to graph construction, since the runner will be 
> modifying the graph as it is built.
>  - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from 
> actual cases of failure to use according to the design)
>  - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner 
> is confusing.
>  - The TransformHierarchy, accessible only via visitor traversals, is 
> cumbersome.
>  - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus, 
> and building an API that more simply and directly supports the technical 
> vision.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1511: [BEAM-115] Only provide expanded Inputs a...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1511


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: This closes #1511

2016-12-07 Thread tgroh
This closes #1511


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5b31a369
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5b31a369
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5b31a369

Branch: refs/heads/master
Commit: 5b31a369962907e257de8019fbf6cde4c615b1c0
Parents: ae52ec1 55d333b
Author: Thomas Groh 
Authored: Wed Dec 7 09:14:38 2016 -0800
Committer: Thomas Groh 
Committed: Wed Dec 7 09:14:38 2016 -0800

--
 .../apex/translation/TranslationContext.java|  4 +--
 .../beam/runners/direct/DirectGraphVisitor.java |  9 +++
 .../direct/KeyedPValueTrackingVisitor.java  |  2 +-
 .../FlinkBatchPipelineTranslator.java   |  4 +--
 .../FlinkStreamingPipelineTranslator.java   |  7 ++
 .../dataflow/DataflowPipelineTranslator.java|  3 +--
 .../apache/beam/runners/spark/SparkRunner.java  | 17 +++--
 .../beam/sdk/runners/TransformHierarchy.java| 26 +++-
 .../sdk/runners/TransformHierarchyTest.java | 13 --
 9 files changed, 38 insertions(+), 47 deletions(-)
--




[1/2] incubator-beam git commit: Only provide expanded Inputs and Outputs

2016-12-07 Thread tgroh
Repository: incubator-beam
Updated Branches:
  refs/heads/master ae52ec1bc -> 5b31a3699


Only provide expanded Inputs and Outputs

This removes PInput and POutput from the immediate API Surface of
TransformHierarchy.Node, and forces Pipeline Visitors to access only
the expanded version of the output.

This is part of the move towards the runner-agnostic representation of a
graph.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/55d333bf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/55d333bf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/55d333bf

Branch: refs/heads/master
Commit: 55d333bff68809ff1a9154491ace02d2d16e3b85
Parents: ae52ec1
Author: Thomas Groh 
Authored: Mon Dec 5 14:29:05 2016 -0800
Committer: Thomas Groh 
Committed: Wed Dec 7 09:14:18 2016 -0800

--
 .../apex/translation/TranslationContext.java|  4 +--
 .../beam/runners/direct/DirectGraphVisitor.java |  9 +++
 .../direct/KeyedPValueTrackingVisitor.java  |  2 +-
 .../FlinkBatchPipelineTranslator.java   |  4 +--
 .../FlinkStreamingPipelineTranslator.java   |  7 ++
 .../dataflow/DataflowPipelineTranslator.java|  3 +--
 .../apache/beam/runners/spark/SparkRunner.java  | 17 +++--
 .../beam/sdk/runners/TransformHierarchy.java| 26 +++-
 .../sdk/runners/TransformHierarchyTest.java | 13 --
 9 files changed, 38 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/55d333bf/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TranslationContext.java
--
diff --git 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TranslationContext.java
 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TranslationContext.java
index 259afbd..3bf01a8 100644
--- 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TranslationContext.java
+++ 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/TranslationContext.java
@@ -35,7 +35,6 @@ import 
org.apache.beam.runners.apex.translation.utils.CoderAdapterStreamCodec;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.runners.TransformHierarchy;
 import org.apache.beam.sdk.transforms.AppliedPTransform;
-import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.util.WindowedValue.FullWindowedValueCoder;
 import org.apache.beam.sdk.util.state.StateInternalsFactory;
 import org.apache.beam.sdk.values.PCollection;
@@ -72,8 +71,7 @@ class TranslationContext {
   }
 
   public void setCurrentTransform(TransformHierarchy.Node treeNode) {
-this.currentTransform = AppliedPTransform.of(treeNode.getFullName(),
-treeNode.getInput(), treeNode.getOutput(), (PTransform) 
treeNode.getTransform());
+this.currentTransform = treeNode.toAppliedPTransform();
   }
 
   public ApexPipelineOptions getPipelineOptions() {

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/55d333bf/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGraphVisitor.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGraphVisitor.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGraphVisitor.java
index cd9d120..4f38bce 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGraphVisitor.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectGraphVisitor.java
@@ -79,13 +79,13 @@ class DirectGraphVisitor extends PipelineVisitor.Defaults {
 
   @Override
   public void visitPrimitiveTransform(TransformHierarchy.Node node) {
-toFinalize.removeAll(node.getInput().expand());
+toFinalize.removeAll(node.getInputs());
 AppliedPTransform appliedTransform = getAppliedTransform(node);
 stepNames.put(appliedTransform, genStepName());
-if (node.getInput().expand().isEmpty()) {
+if (node.getInputs().isEmpty()) {
   rootTransforms.add(appliedTransform);
 } else {
-  for (PValue value : node.getInput().expand()) {
+  for (PValue value : node.getInputs()) {
 primitiveConsumers.put(value, appliedTransform);
   }
 }
@@ -111,8 +111,7 @@ class DirectGraphVisitor extends PipelineVisitor.Defaults {
 
   private AppliedPTransform 
getAppliedTransform(TransformHierarchy.Node node) {
 @SuppressWarnings({"rawtypes", "unchecked"})
-AppliedPTransform application = AppliedPTransform.of(
-node.getFullName(), node.getInput(), node.getOutput(), (PTransform) 
node.getTransform());
+

[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn

2016-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15729262#comment-15729262
 ] 

ASF GitHub Bot commented on BEAM-498:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1529


> Make DoFnWithContext the new DoFn
> -
>
> Key: BEAM-498
> URL: https://issues.apache.org/jira/browse/BEAM-498
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #1529: [BEAM-498] Port ParDoTest from OldDoFn to...

2016-12-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1529


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Port ParDoTest from OldDoFn to new DoFn

2016-12-07 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 1526184ae -> ae52ec1bc


Port ParDoTest from OldDoFn to new DoFn


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8e1e46e7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8e1e46e7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8e1e46e7

Branch: refs/heads/master
Commit: 8e1e46e73edf9cce376ed7bd194db00edc3e60b4
Parents: 1526184
Author: Kenneth Knowles 
Authored: Tue Dec 6 21:01:37 2016 -0800
Committer: Kenneth Knowles 
Committed: Wed Dec 7 09:00:17 2016 -0800

--
 .../apache/beam/sdk/transforms/ParDoTest.java   | 238 +++
 1 file changed, 91 insertions(+), 147 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/8e1e46e7/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
index 593f304..9755076 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
@@ -111,74 +111,9 @@ public class ParDoTest implements Serializable {
   + ":" + window.maxTimestamp().getMillis());
 }
   }
-
-  static class TestOldDoFn extends OldDoFn {
-enum State { UNSTARTED, STARTED, PROCESSING, FINISHED }
-State state = State.UNSTARTED;
-
-final List sideInputViews = new ArrayList<>();
-final List sideOutputTupleTags = new ArrayList<>();
-
-public TestOldDoFn() {
-}
-
-public TestOldDoFn(List sideInputViews,
-List sideOutputTupleTags) {
-  this.sideInputViews.addAll(sideInputViews);
-  this.sideOutputTupleTags.addAll(sideOutputTupleTags);
-}
-
-@Override
-public void startBundle(Context c) {
-  // The Fn can be reused, but only if FinishBundle has been called.
-  assertThat(state, anyOf(equalTo(State.UNSTARTED), 
equalTo(State.FINISHED)));
-  state = State.STARTED;
-  outputToAll(c, "started");
-}
-
-@Override
-public void processElement(ProcessContext c) {
-  assertThat(state,
- anyOf(equalTo(State.STARTED), equalTo(State.PROCESSING)));
-  state = State.PROCESSING;
-  outputToAllWithSideInputs(c, "processing: " + c.element());
-}
-
-@Override
-public void finishBundle(Context c) {
-  assertThat(state,
- anyOf(equalTo(State.STARTED), equalTo(State.PROCESSING)));
-  state = State.FINISHED;
-  outputToAll(c, "finished");
-}
-
-private void outputToAll(Context c, String value) {
-  c.output(value);
-  for (TupleTag sideOutputTupleTag : sideOutputTupleTags) {
-c.sideOutput(sideOutputTupleTag,
- sideOutputTupleTag.getId() + ": " + value);
-  }
-}
-
-private void outputToAllWithSideInputs(ProcessContext c, String value) {
-  if (!sideInputViews.isEmpty()) {
-List sideInputValues = new ArrayList<>();
-for (PCollectionView sideInputView : sideInputViews) {
-  sideInputValues.add(c.sideInput(sideInputView));
-}
-value += ": " + sideInputValues;
-  }
-  c.output(value);
-  for (TupleTag sideOutputTupleTag : sideOutputTupleTags) {
-c.sideOutput(sideOutputTupleTag,
- sideOutputTupleTag.getId() + ": " + value);
-  }
-}
-  }
-
-  static class TestNoOutputDoFn extends OldDoFn {
-@Override
-public void processElement(OldDoFn.ProcessContext c) 
throws Exception {}
+  static class TestNoOutputDoFn extends DoFn {
+@ProcessElement
+public void processElement(DoFn.ProcessContext c) throws 
Exception {}
   }
 
   static class TestDoFn extends DoFn {
@@ -254,52 +189,52 @@ public class ParDoTest implements Serializable {
 }
   }
 
-  static class TestStartBatchErrorDoFn extends OldDoFn {
-@Override
+  static class TestStartBatchErrorDoFn extends DoFn {
+@StartBundle
 public void startBundle(Context c) {
   throw new RuntimeException("test error in initialize");
 }
 
-@Override
+@ProcessElement
 public void processElement(ProcessContext c) {
   // This has to be here.
 }
   }
 
-  static class TestProcessElementErrorDoFn extends OldDoFn {
-@Override
+  static class TestProcessElementErrorDoFn extends DoFn

[1/2] incubator-beam git commit: This closes #1527

2016-12-07 Thread tgroh
Repository: incubator-beam
Updated Branches:
  refs/heads/master 02bb8c375 -> 1526184ae


This closes #1527


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1526184a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1526184a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1526184a

Branch: refs/heads/master
Commit: 1526184ae8be1f8ae6863f830653204157a584cd
Parents: 02bb8c3 b2d7223
Author: Thomas Groh 
Authored: Wed Dec 7 08:51:02 2016 -0800
Committer: Thomas Groh 
Committed: Wed Dec 7 08:51:02 2016 -0800

--
 .../java/org/apache/beam/runners/core/DoFnRunner.java | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)
--




[2/2] incubator-beam git commit: Port most of DoFnRunner Javadoc to new DoFn

2016-12-07 Thread tgroh
Port most of DoFnRunner Javadoc to new DoFn


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b2d72237
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b2d72237
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b2d72237

Branch: refs/heads/master
Commit: b2d72237b592e1dcb5cca30f5cbc9a11d2890c0f
Parents: 02bb8c3
Author: Kenneth Knowles 
Authored: Tue Dec 6 15:20:28 2016 -0800
Committer: Thomas Groh 
Committed: Wed Dec 7 08:51:02 2016 -0800

--
 .../java/org/apache/beam/runners/core/DoFnRunner.java | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/b2d72237/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java
index aac8e8f..501667e 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java
@@ -18,29 +18,29 @@
 package org.apache.beam.runners.core;
 
 import org.apache.beam.sdk.transforms.Aggregator;
+import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.OldDoFn;
-import org.apache.beam.sdk.transforms.OldDoFn.ProcessContext;
 import org.apache.beam.sdk.util.WindowedValue;
 import org.apache.beam.sdk.values.KV;
 
 /**
- * An wrapper interface that represents the execution of a {@link OldDoFn}.
+ * An wrapper interface that represents the execution of a {@link DoFn}.
  */
 public interface DoFnRunner {
   /**
-   * Prepares and calls {@link OldDoFn#startBundle}.
+   * Prepares and calls a {@link DoFn DoFn's} {@link DoFn.StartBundle 
@StartBundle} method.
*/
   void startBundle();
 
   /**
-   * Calls {@link OldDoFn#processElement} with a {@link ProcessContext} 
containing the current
-   * element.
+   * Calls a {@link DoFn DoFn's} {@link DoFn.ProcessElement @ProcessElement} 
method with a
+   * {@link DoFn.ProcessContext} containing the provided element.
*/
   void processElement(WindowedValue elem);
 
   /**
-   * Calls {@link OldDoFn#finishBundle} and performs additional tasks, such as
-   * flushing in-memory states.
+   * Calls a {@link DoFn DoFn's} {@link DoFn.FinishBundle @FinishBundle} 
method and performs
+   * additional tasks, such as flushing in-memory states.
*/
   void finishBundle();
 



  1   2   >