Jenkins build is back to stable : beam_PostCommit_RunnableOnService_GoogleCloudDataflow #226

2016-04-26 Thread Apache Jenkins Server
See 




Re: Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #225

2016-04-26 Thread Dan Halperin
This should be fixed now by a configuration change -- we recently made a
package rename, and the config of this Jenkins build was not updated.

On Tue, Apr 26, 2016 at 6:32 PM, Dan Halperin  wrote:

> Looking
>
> On Tue, Apr 26, 2016 at 6:21 PM, Apache Jenkins Server <
> jenk...@builds.apache.org> wrote:
>
>> See <
>> https://builds.apache.org/job/beam_PostCommit_RunnableOnService_GoogleCloudDataflow/changes
>> >
>>
>>
>


Re: Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #225

2016-04-26 Thread Dan Halperin
Looking

On Tue, Apr 26, 2016 at 6:21 PM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See <
> https://builds.apache.org/job/beam_PostCommit_RunnableOnService_GoogleCloudDataflow/changes
> >
>
>


Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #225

2016-04-26 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #224

2016-04-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-77) Reorganize Directory structure

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259301#comment-15259301
 ] 

ASF GitHub Bot commented on BEAM-77:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/239


> Reorganize Directory structure
> --
>
> Key: BEAM-77
> URL: https://issues.apache.org/jira/browse/BEAM-77
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Reporter: Frances Perry
>Assignee: Jean-Baptiste Onofré
>
> Now that we've done the initial Dataflow code drop, we will restructure 
> directories to provide space for additional SDKs and Runners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[20/21] incubator-beam git commit: Fix a few underlying checkstyle issues in java8 examples

2016-04-26 Thread dhalperi
Fix a few underlying checkstyle issues in java8 examples


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9e19efdf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9e19efdf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9e19efdf

Branch: refs/heads/master
Commit: 9e19efdf37541ee082bed1ebfd6dd0b154de5f0a
Parents: c08f973
Author: Davor Bonaci 
Authored: Mon Apr 25 15:02:27 2016 -0700
Committer: Davor Bonaci 
Committed: Tue Apr 26 17:59:39 2016 -0700

--
 .../apache/beam/examples/complete/game/GameStats.java   | 12 
 .../beam/examples/complete/game/HourlyTeamScore.java|  6 --
 .../apache/beam/examples/complete/game/LeaderBoard.java |  6 --
 3 files changed, 16 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9e19efdf/examples/java8/src/main/java/org/apache/beam/examples/complete/game/GameStats.java
--
diff --git 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/GameStats.java
 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/GameStats.java
index d93c2ae..2d14264 100644
--- 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/GameStats.java
+++ 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/GameStats.java
@@ -207,8 +207,10 @@ public class GameStats extends LeaderBoard {
 c -> c.element().getValue()));
 tableConfigure.put("window_start",
 new WriteWindowedToBigQuery.FieldInfo>("STRING",
-  c -> { IntervalWindow w = (IntervalWindow) c.window();
- return fmt.print(w.start()); }));
+  c -> {
+IntervalWindow w = (IntervalWindow) c.window();
+return fmt.print(w.start());
+  }));
 tableConfigure.put("processing_time",
 new WriteWindowedToBigQuery.FieldInfo>(
 "STRING", c -> fmt.print(Instant.now(;
@@ -226,8 +228,10 @@ public class GameStats extends LeaderBoard {
 new HashMap>();
 tableConfigure.put("window_start",
 new WriteWindowedToBigQuery.FieldInfo("STRING",
-  c -> { IntervalWindow w = (IntervalWindow) c.window();
- return fmt.print(w.start()); }));
+  c -> {
+IntervalWindow w = (IntervalWindow) c.window();
+return fmt.print(w.start());
+  }));
 tableConfigure.put("mean_duration",
 new WriteWindowedToBigQuery.FieldInfo("FLOAT", c -> 
c.element()));
 return tableConfigure;

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9e19efdf/examples/java8/src/main/java/org/apache/beam/examples/complete/game/HourlyTeamScore.java
--
diff --git 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/HourlyTeamScore.java
 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/HourlyTeamScore.java
index 5ce7d95..b516a32 100644
--- 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/HourlyTeamScore.java
+++ 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/HourlyTeamScore.java
@@ -132,8 +132,10 @@ public class HourlyTeamScore extends UserScore {
 c -> c.element().getValue()));
 tableConfig.put("window_start",
 new WriteWindowedToBigQuery.FieldInfo>("STRING",
-  c -> { IntervalWindow w = (IntervalWindow) c.window();
- return fmt.print(w.start()); }));
+  c -> {
+IntervalWindow w = (IntervalWindow) c.window();
+return fmt.print(w.start());
+  }));
 return tableConfig;
   }
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9e19efdf/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java
--
diff --git 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java
 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java
index 594d2b8..97958b0 100644
--- 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java
+++ 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/LeaderBoard.java
@@ -143,8 +143,10 @@ public class LeaderBoard extends HourlyTeamScore {
 c -> c.element().getValue()));
 tableConfigure.put("window_start",
 new WriteWindowedToBigQuery.FieldInfo>("STRING",
-  c -> { IntervalWindow w = (IntervalWindow) c.window();
- return fmt.print(w.start()); }));
+  c -> {
+IntervalWindow w = (IntervalWindow) c.window();
+retu

[01/21] incubator-beam git commit: Update Dataflow worker harness container image to match package changes in this pull request

2016-04-26 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 5e3d7ad20 -> e3105c8e1


Update Dataflow worker harness container image to match package changes
in this pull request


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/0fafd4e8
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/0fafd4e8
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/0fafd4e8

Branch: refs/heads/master
Commit: 0fafd4e89f387b73c33aac65d42a6a367e9dd738
Parents: 0219098
Author: Davor Bonaci 
Authored: Tue Apr 26 16:38:06 2016 -0700
Committer: Davor Bonaci 
Committed: Tue Apr 26 17:59:39 2016 -0700

--
 .../org/apache/beam/runners/dataflow/DataflowPipelineRunner.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/0fafd4e8/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
index 0fc095a..ec4a60c 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
@@ -217,9 +217,9 @@ public class DataflowPipelineRunner extends 
PipelineRunner
   // Default Docker container images that execute Dataflow worker harness, 
residing in Google
   // Container Registry, separately for Batch and Streaming.
   public static final String BATCH_WORKER_HARNESS_CONTAINER_IMAGE
-  = "dataflow.gcr.io/v1beta3/beam-java-batch:beam-master-20160422";
+  = "dataflow.gcr.io/v1beta3/beam-java-batch:beam-master-20160426";
   public static final String STREAMING_WORKER_HARNESS_CONTAINER_IMAGE
-  = "dataflow.gcr.io/v1beta3/beam-java-streaming:beam-master-20160422";
+  = "dataflow.gcr.io/v1beta3/beam-java-streaming:beam-master-20160426";
 
   // The limit of CreateJob request size.
   private static final int CREATE_JOB_REQUEST_LIMIT_BYTES = 10 * 1024 * 1024;



[07/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
new file mode 100644
index 000..3a39e41
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
@@ -0,0 +1,967 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow;
+
+import static org.apache.beam.sdk.util.Structs.addObject;
+import static org.apache.beam.sdk.util.Structs.getDictionary;
+import static org.apache.beam.sdk.util.Structs.getString;
+
+import static org.hamcrest.Matchers.containsString;
+import static org.hamcrest.Matchers.hasKey;
+import static org.hamcrest.core.IsInstanceOf.instanceOf;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Matchers.any;
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Matchers.argThat;
+import static org.mockito.Matchers.eq;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+import 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator.TranslationContext;
+import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
+import 
org.apache.beam.runners.dataflow.options.DataflowPipelineWorkerPoolOptions;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.coders.VarIntCoder;
+import org.apache.beam.sdk.coders.VoidCoder;
+import org.apache.beam.sdk.io.TextIO;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.runners.RecordingPipelineVisitor;
+import org.apache.beam.sdk.transforms.Count;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.Sum;
+import org.apache.beam.sdk.transforms.View;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.util.GcsUtil;
+import org.apache.beam.sdk.util.OutputReference;
+import org.apache.beam.sdk.util.PropertyNames;
+import org.apache.beam.sdk.util.Structs;
+import org.apache.beam.sdk.util.TestCredential;
+import org.apache.beam.sdk.util.WindowingStrategy;
+import org.apache.beam.sdk.util.gcsfs.GcsPath;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.PDone;
+import org.apache.beam.sdk.values.TupleTag;
+
+import com.google.api.services.dataflow.Dataflow;
+import com.google.api.services.dataflow.model.DataflowPackage;
+import com.google.api.services.dataflow.model.Job;
+import com.google.api.services.dataflow.model.Step;
+import com.google.api.services.dataflow.model.WorkerPool;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.ImmutableSet;
+import com.google.common.collect.Iterables;
+
+import org.hamcrest.Matchers;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.internal.matchers.ThrowableMessageMatcher;
+import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+import org.mockito.ArgumentMatcher;
+import org.mockito.invocation.InvocationOnMock;
+import org.mockito.stubbing.Answer;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+
+/**
+

[08/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRegistrarTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRegistrarTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRegistrarTest.java
new file mode 100644
index 000..cf9a95a
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRegistrarTest.java
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.fail;
+
+import 
org.apache.beam.runners.dataflow.options.BlockingDataflowPipelineOptions;
+import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsRegistrar;
+import org.apache.beam.sdk.runners.PipelineRunnerRegistrar;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Lists;
+
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+import java.util.ServiceLoader;
+
+/** Tests for {@link DataflowPipelineRegistrar}. */
+@RunWith(JUnit4.class)
+public class DataflowPipelineRegistrarTest {
+  @Test
+  public void testCorrectOptionsAreReturned() {
+assertEquals(ImmutableList.of(DataflowPipelineOptions.class,
+  BlockingDataflowPipelineOptions.class),
+new DataflowPipelineRegistrar.Options().getPipelineOptions());
+  }
+
+  @Test
+  public void testCorrectRunnersAreReturned() {
+assertEquals(ImmutableList.of(DataflowPipelineRunner.class,
+  BlockingDataflowPipelineRunner.class),
+new DataflowPipelineRegistrar.Runner().getPipelineRunners());
+  }
+
+  @Test
+  public void testServiceLoaderForOptions() {
+for (PipelineOptionsRegistrar registrar :
+
Lists.newArrayList(ServiceLoader.load(PipelineOptionsRegistrar.class).iterator()))
 {
+  if (registrar instanceof DataflowPipelineRegistrar.Options) {
+return;
+  }
+}
+fail("Expected to find " + DataflowPipelineRegistrar.Options.class);
+  }
+
+  @Test
+  public void testServiceLoaderForRunner() {
+for (PipelineRunnerRegistrar registrar :
+
Lists.newArrayList(ServiceLoader.load(PipelineRunnerRegistrar.class).iterator()))
 {
+  if (registrar instanceof DataflowPipelineRegistrar.Runner) {
+return;
+  }
+}
+fail("Expected to find " + DataflowPipelineRegistrar.Runner.class);
+  }
+}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerTest.java
new file mode 100644
index 000..79e281e
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerTest.java
@@ -0,0 +1,1401 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or

[11/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunner.java
deleted file mode 100644
index d2f8bbe..000
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunner.java
+++ /dev/null
@@ -1,3022 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners;
-
-import static org.apache.beam.sdk.util.StringUtils.approximatePTransformName;
-import static org.apache.beam.sdk.util.StringUtils.approximateSimpleName;
-import static org.apache.beam.sdk.util.WindowedValue.valueInEmptyWindows;
-
-import static com.google.common.base.Preconditions.checkArgument;
-import static com.google.common.base.Preconditions.checkState;
-
-import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.Pipeline.PipelineVisitor;
-import org.apache.beam.sdk.PipelineResult.State;
-import org.apache.beam.sdk.annotations.Experimental;
-import org.apache.beam.sdk.coders.AvroCoder;
-import org.apache.beam.sdk.coders.BigEndianLongCoder;
-import org.apache.beam.sdk.coders.CannotProvideCoderException;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.Coder.NonDeterministicException;
-import org.apache.beam.sdk.coders.CoderException;
-import org.apache.beam.sdk.coders.CoderRegistry;
-import org.apache.beam.sdk.coders.IterableCoder;
-import org.apache.beam.sdk.coders.KvCoder;
-import org.apache.beam.sdk.coders.ListCoder;
-import org.apache.beam.sdk.coders.MapCoder;
-import org.apache.beam.sdk.coders.SerializableCoder;
-import org.apache.beam.sdk.coders.StandardCoder;
-import org.apache.beam.sdk.coders.VarIntCoder;
-import org.apache.beam.sdk.coders.VarLongCoder;
-import org.apache.beam.sdk.io.AvroIO;
-import org.apache.beam.sdk.io.BigQueryIO;
-import org.apache.beam.sdk.io.FileBasedSink;
-import org.apache.beam.sdk.io.PubsubIO;
-import org.apache.beam.sdk.io.Read;
-import org.apache.beam.sdk.io.ShardNameTemplate;
-import org.apache.beam.sdk.io.TextIO;
-import org.apache.beam.sdk.io.UnboundedSource;
-import org.apache.beam.sdk.io.Write;
-import org.apache.beam.sdk.options.DataflowPipelineDebugOptions;
-import org.apache.beam.sdk.options.DataflowPipelineOptions;
-import org.apache.beam.sdk.options.DataflowPipelineWorkerPoolOptions;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.options.PipelineOptionsValidator;
-import org.apache.beam.sdk.options.StreamingOptions;
-import org.apache.beam.sdk.runners.DataflowPipelineTranslator.JobSpecification;
-import 
org.apache.beam.sdk.runners.DataflowPipelineTranslator.TransformTranslator;
-import 
org.apache.beam.sdk.runners.DataflowPipelineTranslator.TranslationContext;
-import org.apache.beam.sdk.runners.dataflow.AssignWindows;
-import org.apache.beam.sdk.runners.dataflow.DataflowAggregatorTransforms;
-import org.apache.beam.sdk.runners.dataflow.PubsubIOTranslator;
-import org.apache.beam.sdk.runners.dataflow.ReadTranslator;
-import org.apache.beam.sdk.runners.worker.IsmFormat;
-import org.apache.beam.sdk.runners.worker.IsmFormat.IsmRecord;
-import org.apache.beam.sdk.runners.worker.IsmFormat.IsmRecordCoder;
-import org.apache.beam.sdk.runners.worker.IsmFormat.MetadataKeyCoder;
-import org.apache.beam.sdk.transforms.Aggregator;
-import org.apache.beam.sdk.transforms.Combine;
-import org.apache.beam.sdk.transforms.Combine.CombineFn;
-import org.apache.beam.sdk.transforms.Create;
-import org.apache.beam.sdk.transforms.DoFn;
-import org.apache.beam.sdk.transforms.Flatten;
-import org.apache.beam.sdk.transforms.GroupByKey;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.transforms.ParDo;
-import org.apache.beam.sdk.transforms.SerializableFunction;
-import org.apache.beam.sdk.transforms.View;
-import org.apache.beam.sdk.transforms.View.CreatePCollectionView;
-import org.ap

[17/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
Reorganize Java packages in the sources of the Google Cloud Dataflow runner

Packages are moving from org.apache.beam.sdk to 
org.apache.beam.runners.dataflow.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/02190985
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/02190985
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/02190985

Branch: refs/heads/master
Commit: 021909855fcf6729ce6ccb9b9ff76f1ca5af35db
Parents: 9e19efd
Author: Davor Bonaci 
Authored: Mon Apr 25 14:16:03 2016 -0700
Committer: Davor Bonaci 
Committed: Tue Apr 26 17:59:39 2016 -0700

--
 .../apache/beam/examples/MinimalWordCount.java  |4 +-
 .../org/apache/beam/examples/WordCount.java |2 +-
 .../examples/common/DataflowExampleOptions.java |2 +-
 .../examples/common/DataflowExampleUtils.java   |8 +-
 .../common/ExampleBigQueryTableOptions.java |2 +-
 ...xamplePubsubTopicAndSubscriptionOptions.java |2 +-
 .../common/ExamplePubsubTopicOptions.java   |2 +-
 .../examples/common/PubsubFileInjector.java |2 +-
 .../beam/examples/complete/AutoComplete.java|2 +-
 .../examples/complete/StreamingWordExtract.java |2 +-
 .../examples/complete/TopWikipediaSessions.java |2 +-
 .../beam/examples/cookbook/DeDupExample.java|2 +-
 .../beam/examples/cookbook/TriggerExample.java  |4 +-
 .../org/apache/beam/examples/WordCountIT.java   |4 +-
 .../beam/examples/MinimalWordCountJava8.java|4 +-
 .../beam/examples/complete/game/GameStats.java  |2 +-
 .../examples/complete/game/LeaderBoard.java |2 +-
 .../beam/runners/flink/examples/TFIDF.java  |1 +
 .../beam/runners/flink/examples/WordCount.java  |9 +-
 .../flink/examples/streaming/AutoComplete.java  |   21 +-
 .../flink/examples/streaming/JoinExamples.java  |6 +-
 .../KafkaWindowedWordCountExample.java  |   11 +-
 .../examples/streaming/WindowedWordCount.java   |   13 +-
 .../runners/flink/FlinkPipelineOptions.java |2 +-
 .../beam/runners/flink/FlinkPipelineRunner.java |4 +-
 .../FlinkBatchPipelineTranslator.java   |2 +-
 .../FlinkStreamingPipelineTranslator.java   |2 +-
 runners/google-cloud-dataflow-java/pom.xml  |4 +-
 .../BlockingDataflowPipelineRunner.java |  186 ++
 .../DataflowJobAlreadyExistsException.java  |   35 +
 .../DataflowJobAlreadyUpdatedException.java |   34 +
 .../dataflow/DataflowJobCancelledException.java |   39 +
 .../runners/dataflow/DataflowJobException.java  |   41 +
 .../dataflow/DataflowJobExecutionException.java |   35 +
 .../dataflow/DataflowJobUpdatedException.java   |   51 +
 .../runners/dataflow/DataflowPipelineJob.java   |  397 +++
 .../dataflow/DataflowPipelineRegistrar.java |   62 +
 .../dataflow/DataflowPipelineRunner.java| 3025 ++
 .../dataflow/DataflowPipelineRunnerHooks.java   |   39 +
 .../dataflow/DataflowPipelineTranslator.java| 1059 ++
 .../dataflow/DataflowServiceException.java  |   33 +
 .../dataflow/internal/AssignWindows.java|   89 +
 .../dataflow/internal/BigQueryIOTranslator.java |   72 +
 .../dataflow/internal/CustomSources.java|  121 +
 .../internal/DataflowAggregatorTransforms.java  |   81 +
 .../internal/DataflowMetricUpdateExtractor.java |  111 +
 .../dataflow/internal/PubsubIOTranslator.java   |  108 +
 .../dataflow/internal/ReadTranslator.java   |  105 +
 .../runners/dataflow/internal/package-info.java |   21 +
 .../BlockingDataflowPipelineOptions.java|   55 +
 .../dataflow/options/CloudDebuggerOptions.java  |   56 +
 .../options/DataflowPipelineDebugOptions.java   |  247 ++
 .../options/DataflowPipelineOptions.java|  126 +
 .../DataflowPipelineWorkerPoolOptions.java  |  263 ++
 .../options/DataflowProfilingOptions.java   |   50 +
 .../options/DataflowWorkerHarnessOptions.java   |   55 +
 .../options/DataflowWorkerLoggingOptions.java   |  159 +
 .../testing/TestDataflowPipelineOptions.java|   30 +
 .../testing/TestDataflowPipelineRunner.java |  273 ++
 .../dataflow/util/DataflowPathValidator.java|  100 +
 .../dataflow/util/DataflowTransport.java|  114 +
 .../beam/runners/dataflow/util/GcsStager.java   |   55 +
 .../runners/dataflow/util/MonitoringUtil.java   |  237 ++
 .../beam/runners/dataflow/util/PackageUtil.java |  333 ++
 .../beam/runners/dataflow/util/Stager.java  |   30 +
 .../BlockingDataflowPipelineOptions.java|   50 -
 .../beam/sdk/options/CloudDebuggerOptions.java  |   53 -
 .../options/DataflowPipelineDebugOptions.java   |  242 --
 .../sdk/options/DataflowPipelineOptions.java|  115 -
 .../DataflowPipelineWorkerPoolOptions.java  |  258 --
 .../sdk/options/DataflowProfilingOptions.java   |   48 -
 .../options/DataflowWorkerHarnessOptions.java   |   51 -
 .../options/

[16/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
new file mode 100644
index 000..0fc095a
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunner.java
@@ -0,0 +1,3025 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow;
+
+import static org.apache.beam.sdk.util.StringUtils.approximatePTransformName;
+import static org.apache.beam.sdk.util.StringUtils.approximateSimpleName;
+import static org.apache.beam.sdk.util.WindowedValue.valueInEmptyWindows;
+
+import static com.google.common.base.Preconditions.checkArgument;
+import static com.google.common.base.Preconditions.checkState;
+
+import 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator.JobSpecification;
+import 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator.TransformTranslator;
+import 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator.TranslationContext;
+import org.apache.beam.runners.dataflow.internal.AssignWindows;
+import org.apache.beam.runners.dataflow.internal.DataflowAggregatorTransforms;
+import org.apache.beam.runners.dataflow.internal.PubsubIOTranslator;
+import org.apache.beam.runners.dataflow.internal.ReadTranslator;
+import org.apache.beam.runners.dataflow.options.DataflowPipelineDebugOptions;
+import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
+import 
org.apache.beam.runners.dataflow.options.DataflowPipelineWorkerPoolOptions;
+import org.apache.beam.runners.dataflow.util.DataflowTransport;
+import org.apache.beam.runners.dataflow.util.MonitoringUtil;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.Pipeline.PipelineVisitor;
+import org.apache.beam.sdk.PipelineResult.State;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.AvroCoder;
+import org.apache.beam.sdk.coders.BigEndianLongCoder;
+import org.apache.beam.sdk.coders.CannotProvideCoderException;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.Coder.NonDeterministicException;
+import org.apache.beam.sdk.coders.CoderException;
+import org.apache.beam.sdk.coders.CoderRegistry;
+import org.apache.beam.sdk.coders.IterableCoder;
+import org.apache.beam.sdk.coders.KvCoder;
+import org.apache.beam.sdk.coders.ListCoder;
+import org.apache.beam.sdk.coders.MapCoder;
+import org.apache.beam.sdk.coders.SerializableCoder;
+import org.apache.beam.sdk.coders.StandardCoder;
+import org.apache.beam.sdk.coders.VarIntCoder;
+import org.apache.beam.sdk.coders.VarLongCoder;
+import org.apache.beam.sdk.io.AvroIO;
+import org.apache.beam.sdk.io.BigQueryIO;
+import org.apache.beam.sdk.io.FileBasedSink;
+import org.apache.beam.sdk.io.PubsubIO;
+import org.apache.beam.sdk.io.Read;
+import org.apache.beam.sdk.io.ShardNameTemplate;
+import org.apache.beam.sdk.io.TextIO;
+import org.apache.beam.sdk.io.UnboundedSource;
+import org.apache.beam.sdk.io.Write;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsValidator;
+import org.apache.beam.sdk.options.StreamingOptions;
+import org.apache.beam.sdk.runners.AggregatorPipelineExtractor;
+import org.apache.beam.sdk.runners.PipelineRunner;
+import org.apache.beam.sdk.runners.TransformTreeNode;
+import org.apache.beam.sdk.runners.worker.IsmFormat;
+import org.apache.beam.sdk.runners.worker.IsmFormat.IsmRecord;
+import org.apache.beam.sdk.runners.worker.IsmFormat.IsmRecordCoder;
+import org.apache.beam.sdk.runners.worker.IsmFormat.MetadataKeyCoder;
+import org.apache.beam.sdk.transforms.Aggregator;
+import org.apache.beam.sdk.transforms.Combine;
+import org.apache.beam.sdk.transforms.Combine.CombineFn;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.tran

[06/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/transforms/DataflowGroupByKeyTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/transforms/DataflowGroupByKeyTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/transforms/DataflowGroupByKeyTest.java
new file mode 100644
index 000..f0e677e
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/transforms/DataflowGroupByKeyTest.java
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.transforms;
+
+import org.apache.beam.runners.dataflow.DataflowPipelineRunner;
+import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.coders.BigEndianIntegerCoder;
+import org.apache.beam.sdk.coders.KvCoder;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.GroupByKey;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.windowing.Sessions;
+import org.apache.beam.sdk.transforms.windowing.Window;
+import org.apache.beam.sdk.util.NoopPathValidator;
+import org.apache.beam.sdk.util.WindowingStrategy;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+import org.joda.time.Duration;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+import java.util.Arrays;
+import java.util.List;
+
+/** Tests for {@link GroupByKey} for the {@link DataflowPipelineRunner}. */
+@RunWith(JUnit4.class)
+public class DataflowGroupByKeyTest {
+  @Rule
+  public ExpectedException thrown = ExpectedException.none();
+
+  /**
+   * Create a test pipeline that uses the {@link DataflowPipelineRunner} so 
that {@link GroupByKey}
+   * is not expanded. This is used for verifying that even without expansion 
the proper errors show
+   * up.
+   */
+  private Pipeline createTestServiceRunner() {
+DataflowPipelineOptions options = 
PipelineOptionsFactory.as(DataflowPipelineOptions.class);
+options.setRunner(DataflowPipelineRunner.class);
+options.setProject("someproject");
+options.setStagingLocation("gs://staging");
+options.setPathValidatorClass(NoopPathValidator.class);
+options.setDataflowClient(null);
+return Pipeline.create(options);
+  }
+
+  @Test
+  public void testInvalidWindowsService() {
+Pipeline p = createTestServiceRunner();
+
+List> ungroupedPairs = Arrays.asList();
+
+PCollection> input =
+p.apply(Create.of(ungroupedPairs)
+.withCoder(KvCoder.of(StringUtf8Coder.of(), 
BigEndianIntegerCoder.of(
+.apply(Window.>into(
+Sessions.withGapDuration(Duration.standardMinutes(1;
+
+thrown.expect(IllegalStateException.class);
+thrown.expectMessage("GroupByKey must have a valid Window merge function");
+input
+.apply("GroupByKey", GroupByKey.create())
+.apply("GroupByKeyAgain", GroupByKey.>create());
+  }
+
+  @Test
+  public void testGroupByKeyServiceUnbounded() {
+Pipeline p = createTestServiceRunner();
+
+PCollection> input =
+p.apply(
+new PTransform>>() {
+  @Override
+  public PCollection> apply(PBegin input) {
+return PCollection.>createPrimitiveOutputInternal(
+input.getPipeline(),
+WindowingStrategy.globalDefault(),
+PCollection.IsBounded.UNBOUNDED)
+.setTypeDescriptorInternal(new TypeDescriptor>() {});
+  }
+});
+
+thrown.expect(IllegalStateException.class);
+thrown

[05/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineJobTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineJobTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineJobTest.java
deleted file mode 100644
index d496f38..000
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineJobTest.java
+++ /dev/null
@@ -1,606 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners;
-
-import static org.hamcrest.MatcherAssert.assertThat;
-import static org.hamcrest.Matchers.allOf;
-import static org.hamcrest.Matchers.contains;
-import static org.hamcrest.Matchers.containsInAnyOrder;
-import static org.hamcrest.Matchers.empty;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.greaterThanOrEqualTo;
-import static org.hamcrest.Matchers.hasEntry;
-import static org.hamcrest.Matchers.is;
-import static org.hamcrest.Matchers.lessThanOrEqualTo;
-import static org.junit.Assert.assertEquals;
-import static org.mockito.Matchers.eq;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.when;
-
-import org.apache.beam.sdk.PipelineResult.State;
-import org.apache.beam.sdk.runners.dataflow.DataflowAggregatorTransforms;
-import org.apache.beam.sdk.testing.FastNanoClockAndSleeper;
-import org.apache.beam.sdk.transforms.Aggregator;
-import org.apache.beam.sdk.transforms.AppliedPTransform;
-import org.apache.beam.sdk.transforms.Combine.CombineFn;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.transforms.Sum;
-import org.apache.beam.sdk.util.AttemptBoundedExponentialBackOff;
-import org.apache.beam.sdk.util.MonitoringUtil;
-import org.apache.beam.sdk.values.PInput;
-import org.apache.beam.sdk.values.POutput;
-
-import com.google.api.services.dataflow.Dataflow;
-import com.google.api.services.dataflow.Dataflow.Projects.Jobs.Get;
-import com.google.api.services.dataflow.Dataflow.Projects.Jobs.GetMetrics;
-import com.google.api.services.dataflow.Dataflow.Projects.Jobs.Messages;
-import com.google.api.services.dataflow.model.Job;
-import com.google.api.services.dataflow.model.JobMetrics;
-import com.google.api.services.dataflow.model.MetricStructuredName;
-import com.google.api.services.dataflow.model.MetricUpdate;
-import com.google.common.collect.ImmutableList;
-import com.google.common.collect.ImmutableMap;
-import com.google.common.collect.ImmutableSetMultimap;
-
-import org.junit.Before;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.ExpectedException;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-import org.mockito.Mock;
-import org.mockito.MockitoAnnotations;
-
-import java.io.IOException;
-import java.math.BigDecimal;
-import java.net.SocketTimeoutException;
-import java.util.concurrent.TimeUnit;
-
-/**
- * Tests for DataflowPipelineJob.
- */
-@RunWith(JUnit4.class)
-public class DataflowPipelineJobTest {
-  private static final String PROJECT_ID = "someProject";
-  private static final String JOB_ID = "1234";
-
-  @Mock
-  private Dataflow mockWorkflowClient;
-  @Mock
-  private Dataflow.Projects mockProjects;
-  @Mock
-  private Dataflow.Projects.Jobs mockJobs;
-  @Rule
-  public FastNanoClockAndSleeper fastClock = new FastNanoClockAndSleeper();
-
-  @Rule
-  public ExpectedException thrown = ExpectedException.none();
-
-  @Before
-  public void setup() {
-MockitoAnnotations.initMocks(this);
-
-when(mockWorkflowClient.projects()).thenReturn(mockProjects);
-when(mockProjects.jobs()).thenReturn(mockJobs);
-  }
-
-  /**
-   * Validates that a given time is valid for the total time slept by a
-   * AttemptBoundedExponentialBackOff given the number of retries and
-   * an initial polling interval.
-   *
-   * @param pollingIntervalMillis The initial polling interval given.
-   * @param attempts The number of attempts made
-   * @param timeSleptMillis The amou

[02/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/util/PackageUtilTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/util/PackageUtilTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/util/PackageUtilTest.java
deleted file mode 100644
index 21bc60e..000
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/util/PackageUtilTest.java
+++ /dev/null
@@ -1,484 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.util;
-
-import static org.hamcrest.Matchers.equalTo;
-import static 
org.hamcrest.collection.IsIterableContainingInAnyOrder.containsInAnyOrder;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotEquals;
-import static org.junit.Assert.assertNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-import static org.mockito.Matchers.any;
-import static org.mockito.Matchers.anyString;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
-import static org.mockito.Mockito.verifyNoMoreInteractions;
-import static org.mockito.Mockito.when;
-
-import org.apache.beam.sdk.options.GcsOptions;
-import org.apache.beam.sdk.options.PipelineOptionsFactory;
-import org.apache.beam.sdk.testing.ExpectedLogs;
-import org.apache.beam.sdk.testing.FastNanoClockAndSleeper;
-import org.apache.beam.sdk.util.PackageUtil.PackageAttributes;
-import org.apache.beam.sdk.util.gcsfs.GcsPath;
-
-import com.google.api.client.googleapis.json.GoogleJsonError.ErrorInfo;
-import com.google.api.client.googleapis.json.GoogleJsonResponseException;
-import com.google.api.client.http.HttpRequest;
-import com.google.api.client.http.HttpResponse;
-import com.google.api.client.http.HttpStatusCodes;
-import com.google.api.client.http.HttpTransport;
-import com.google.api.client.http.LowLevelHttpRequest;
-import com.google.api.client.json.GenericJson;
-import com.google.api.client.json.Json;
-import com.google.api.client.json.JsonFactory;
-import com.google.api.client.json.jackson2.JacksonFactory;
-import com.google.api.client.testing.http.HttpTesting;
-import com.google.api.client.testing.http.MockHttpTransport;
-import com.google.api.client.testing.http.MockLowLevelHttpRequest;
-import com.google.api.client.testing.http.MockLowLevelHttpResponse;
-import com.google.api.services.dataflow.model.DataflowPackage;
-import com.google.common.collect.ImmutableList;
-import com.google.common.collect.Iterables;
-import com.google.common.collect.Lists;
-import com.google.common.io.Files;
-import com.google.common.io.LineReader;
-
-import org.hamcrest.BaseMatcher;
-import org.hamcrest.Description;
-import org.hamcrest.Matchers;
-import org.junit.Before;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.rules.TemporaryFolder;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-import org.mockito.Mock;
-import org.mockito.MockitoAnnotations;
-
-import java.io.File;
-import java.io.FileNotFoundException;
-import java.io.IOException;
-import java.nio.channels.Channels;
-import java.nio.channels.Pipe;
-import java.nio.charset.StandardCharsets;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.Collections;
-import java.util.List;
-import java.util.regex.Pattern;
-import java.util.zip.ZipEntry;
-import java.util.zip.ZipInputStream;
-
-/** Tests for PackageUtil. */
-@RunWith(JUnit4.class)
-public class PackageUtilTest {
-  @Rule public ExpectedLogs logged = ExpectedLogs.none(PackageUtil.class);
-  @Rule
-  public TemporaryFolder tmpFolder = new TemporaryFolder();
-
-  @Rule
-  public FastNanoClockAndSleeper fastNanoClockAndSleeper = new 
FastNanoClockAndSleeper();
-
-  @Mock
-  GcsUtil mockGcsUtil;
-
-  // 128 bits, base64 encoded is 171 bits, rounds to 22 bytes
-  private static final String HASH_PATTERN = "[a-zA-Z0-9+-]{22}";
-
-  // Hamcrest matcher to assert a string matches a pattern
-  private static class RegexMatcher ex

[12/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineJob.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineJob.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineJob.java
deleted file mode 100644
index 428a6ed..000
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineJob.java
+++ /dev/null
@@ -1,395 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners;
-
-import static org.apache.beam.sdk.util.TimeUtil.fromCloudTime;
-
-import org.apache.beam.sdk.PipelineResult;
-import org.apache.beam.sdk.runners.dataflow.DataflowAggregatorTransforms;
-import org.apache.beam.sdk.runners.dataflow.DataflowMetricUpdateExtractor;
-import org.apache.beam.sdk.transforms.Aggregator;
-import org.apache.beam.sdk.util.AttemptAndTimeBoundedExponentialBackOff;
-import org.apache.beam.sdk.util.AttemptBoundedExponentialBackOff;
-import org.apache.beam.sdk.util.MapAggregatorValues;
-import org.apache.beam.sdk.util.MonitoringUtil;
-
-import com.google.api.client.googleapis.json.GoogleJsonResponseException;
-import com.google.api.client.util.BackOff;
-import com.google.api.client.util.BackOffUtils;
-import com.google.api.client.util.NanoClock;
-import com.google.api.client.util.Sleeper;
-import com.google.api.services.dataflow.Dataflow;
-import com.google.api.services.dataflow.model.Job;
-import com.google.api.services.dataflow.model.JobMessage;
-import com.google.api.services.dataflow.model.JobMetrics;
-import com.google.api.services.dataflow.model.MetricUpdate;
-import com.google.common.annotations.VisibleForTesting;
-import com.google.common.base.Throwables;
-
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.IOException;
-import java.net.SocketTimeoutException;
-import java.util.List;
-import java.util.Map;
-import java.util.concurrent.TimeUnit;
-
-import javax.annotation.Nullable;
-
-/**
- * A DataflowPipelineJob represents a job submitted to Dataflow using
- * {@link DataflowPipelineRunner}.
- */
-public class DataflowPipelineJob implements PipelineResult {
-  private static final Logger LOG = 
LoggerFactory.getLogger(DataflowPipelineJob.class);
-
-  /**
-   * The id for the job.
-   */
-  private String jobId;
-
-  /**
-   * Google cloud project to associate this pipeline with.
-   */
-  private String projectId;
-
-  /**
-   * Client for the Dataflow service. This can be used to query the service
-   * for information about the job.
-   */
-  private Dataflow dataflowClient;
-
-  /**
-   * The state the job terminated in or {@code null} if the job has not 
terminated.
-   */
-  @Nullable
-  private State terminalState = null;
-
-  /**
-   * The job that replaced this one or {@code null} if the job has not been 
replaced.
-   */
-  @Nullable
-  private DataflowPipelineJob replacedByJob = null;
-
-  private DataflowAggregatorTransforms aggregatorTransforms;
-
-  /**
-   * The Metric Updates retrieved after the job was in a terminal state.
-   */
-  private List terminalMetricUpdates;
-
-  /**
-   * The polling interval for job status and messages information.
-   */
-  static final long MESSAGES_POLLING_INTERVAL = TimeUnit.SECONDS.toMillis(2);
-  static final long STATUS_POLLING_INTERVAL = TimeUnit.SECONDS.toMillis(2);
-
-  /**
-   * The amount of polling attempts for job status and messages information.
-   */
-  static final int MESSAGES_POLLING_ATTEMPTS = 10;
-  static final int STATUS_POLLING_ATTEMPTS = 5;
-
-  /**
-   * Constructs the job.
-   *
-   * @param projectId the project id
-   * @param jobId the job id
-   * @param dataflowClient the client for the Dataflow Service
-   */
-  public DataflowPipelineJob(String projectId, String jobId, Dataflow 
dataflowClient,
-  DataflowAggregatorTransforms aggregatorTransforms) {
-this.projectId = projectId;
-this.jobId = jobId;
-this.dataflowClient = dataflowClient;
-this.aggregatorTransforms = aggrega

[19/21] incubator-beam git commit: Increase visibility in PAssert and ZipFiles utilities

2016-04-26 Thread dhalperi
Increase visibility in PAssert and ZipFiles utilities

This is needed for package reorganization in runners/google-cloud-dataflow.
Those classes will have to move away from org.apache.beam.sdk.* packages.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c08f973c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c08f973c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c08f973c

Branch: refs/heads/master
Commit: c08f973cb5585885a2da93c66716dec87670ca30
Parents: 46f7447
Author: Davor Bonaci 
Authored: Mon Apr 25 14:30:25 2016 -0700
Committer: Davor Bonaci 
Committed: Tue Apr 26 17:59:39 2016 -0700

--
 .../src/main/java/org/apache/beam/sdk/testing/PAssert.java   | 8 
 .../src/main/java/org/apache/beam/sdk/util/ZipFiles.java | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c08f973c/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
index f328c5b..1265acd 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
@@ -98,8 +98,8 @@ public class PAssert {
 
   private static final Logger LOG = LoggerFactory.getLogger(PAssert.class);
 
-  static final String SUCCESS_COUNTER = "PAssertSuccess";
-  static final String FAILURE_COUNTER = "PAssertFailure";
+  public static final String SUCCESS_COUNTER = "PAssertSuccess";
+  public static final String FAILURE_COUNTER = "PAssertFailure";
 
   private static int assertCount = 0;
 
@@ -576,7 +576,7 @@ public class PAssert {
* This is generally useful for assertion functions that
* are serializable but whose underlying data may not have a coder.
*/
-  static class OneSideInputAssert
+  public static class OneSideInputAssert
   extends PTransform implements Serializable {
 private final transient PTransform> 
createActual;
 private final SerializableFunction checkerFn;
@@ -647,7 +647,7 @@ public class PAssert {
* are not serializable, but have coders (provided
* by the underlying {@link PCollection}s).
*/
-  static class TwoSideInputAssert
+  public static class TwoSideInputAssert
   extends PTransform implements Serializable {
 
 private final transient PTransform> 
createActual;

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c08f973c/sdks/java/core/src/main/java/org/apache/beam/sdk/util/ZipFiles.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/ZipFiles.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/ZipFiles.java
index 6d73027..038b9cb 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/ZipFiles.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/ZipFiles.java
@@ -226,7 +226,7 @@ public final class ZipFiles {
* @throws IOException the zipping failed, e.g. because the input was not
* readable.
*/
-  static void zipDirectory(
+  public static void zipDirectory(
   File sourceDirectory,
   OutputStream outputStream) throws IOException {
 checkNotNull(sourceDirectory);



[14/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java
new file mode 100644
index 000..71c8a78
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineDebugOptions.java
@@ -0,0 +1,247 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.options;
+
+import org.apache.beam.runners.dataflow.util.DataflowPathValidator;
+import org.apache.beam.runners.dataflow.util.DataflowTransport;
+import org.apache.beam.runners.dataflow.util.GcsStager;
+import org.apache.beam.runners.dataflow.util.Stager;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.DefaultValueFactory;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.Hidden;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.util.InstanceBuilder;
+import org.apache.beam.sdk.util.PathValidator;
+
+import com.google.api.services.dataflow.Dataflow;
+
+import com.fasterxml.jackson.annotation.JsonIgnore;
+
+import java.util.List;
+import java.util.Map;
+
+/**
+ * Internal. Options used to control execution of the Dataflow SDK for
+ * debugging and testing purposes.
+ */
+@Description("[Internal] Options used to control execution of the Dataflow SDK 
for "
++ "debugging and testing purposes.")
+@Hidden
+public interface DataflowPipelineDebugOptions extends PipelineOptions {
+
+  /**
+   * The list of backend experiments to enable.
+   *
+   * Dataflow provides a number of experimental features that can be enabled
+   * with this flag.
+   *
+   * Please sync with the Dataflow team before enabling any experiments.
+   */
+  @Description("[Experimental] Dataflow provides a number of experimental 
features that can "
+  + "be enabled with this flag. Please sync with the Dataflow team before 
enabling any "
+  + "experiments.")
+  @Experimental
+  List getExperiments();
+  void setExperiments(List value);
+
+  /**
+   * The root URL for the Dataflow API. {@code dataflowEndpoint} can override 
this value
+   * if it contains an absolute URL, otherwise {@code apiRootUrl} will be 
combined with
+   * {@code dataflowEndpoint} to generate the full URL to communicate with the 
Dataflow API.
+   */
+  @Description("The root URL for the Dataflow API. dataflowEndpoint can 
override this "
+  + "value if it contains an absolute URL, otherwise apiRootUrl will be 
combined with "
+  + "dataflowEndpoint to generate the full URL to communicate with the 
Dataflow API.")
+  @Default.String(Dataflow.DEFAULT_ROOT_URL)
+  String getApiRootUrl();
+  void setApiRootUrl(String value);
+
+  /**
+   * Dataflow endpoint to use.
+   *
+   * Defaults to the current version of the Google Cloud Dataflow
+   * API, at the time the current SDK version was released.
+   *
+   * If the string contains "://", then this is treated as a URL,
+   * otherwise {@link #getApiRootUrl()} is used as the root
+   * URL.
+   */
+  @Description("The URL for the Dataflow API. If the string contains \"://\", 
this"
+  + " will be treated as the entire URL, otherwise will be treated 
relative to apiRootUrl.")
+  @Default.String(Dataflow.DEFAULT_SERVICE_PATH)
+  String getDataflowEndpoint();
+  void setDataflowEndpoint(String value);
+
+  /**
+   * The path to write the translated Dataflow job specification out to
+   * at job submission time. The Dataflow job specification will be 
represented in JSON
+   * format.
+   */
+  @Description("The path to write the translated Dataflow job specification 
out to "
+  + "at job submission time. The Dataflow job specification will be 
represented in JSON "
+  + "forma

[21/21] incubator-beam git commit: Closes #239

2016-04-26 Thread dhalperi
Closes #239


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e3105c8e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e3105c8e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e3105c8e

Branch: refs/heads/master
Commit: e3105c8e109535f801fd145b91b0c7aa93b86d1a
Parents: 5e3d7ad 0fafd4e
Author: Dan Halperin 
Authored: Tue Apr 26 18:07:43 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 18:07:43 2016 -0700

--
 .../apache/beam/examples/MinimalWordCount.java  |4 +-
 .../org/apache/beam/examples/WordCount.java |2 +-
 .../examples/common/DataflowExampleOptions.java |2 +-
 .../examples/common/DataflowExampleUtils.java   |8 +-
 .../common/ExampleBigQueryTableOptions.java |2 +-
 ...xamplePubsubTopicAndSubscriptionOptions.java |2 +-
 .../common/ExamplePubsubTopicOptions.java   |2 +-
 .../examples/common/PubsubFileInjector.java |2 +-
 .../beam/examples/complete/AutoComplete.java|2 +-
 .../examples/complete/StreamingWordExtract.java |2 +-
 .../examples/complete/TopWikipediaSessions.java |2 +-
 .../beam/examples/cookbook/DeDupExample.java|2 +-
 .../beam/examples/cookbook/TriggerExample.java  |4 +-
 .../org/apache/beam/examples/WordCountIT.java   |4 +-
 .../beam/examples/MinimalWordCountJava8.java|4 +-
 .../beam/examples/complete/game/GameStats.java  |   14 +-
 .../examples/complete/game/HourlyTeamScore.java |6 +-
 .../examples/complete/game/LeaderBoard.java |8 +-
 .../beam/runners/flink/examples/TFIDF.java  |1 +
 .../beam/runners/flink/examples/WordCount.java  |9 +-
 .../flink/examples/streaming/AutoComplete.java  |   21 +-
 .../flink/examples/streaming/JoinExamples.java  |6 +-
 .../KafkaWindowedWordCountExample.java  |   11 +-
 .../examples/streaming/WindowedWordCount.java   |   13 +-
 .../runners/flink/FlinkPipelineOptions.java |2 +-
 .../beam/runners/flink/FlinkPipelineRunner.java |4 +-
 .../FlinkBatchPipelineTranslator.java   |2 +-
 .../FlinkStreamingPipelineTranslator.java   |2 +-
 runners/google-cloud-dataflow-java/pom.xml  |4 +-
 .../BlockingDataflowPipelineRunner.java |  186 ++
 .../DataflowJobAlreadyExistsException.java  |   35 +
 .../DataflowJobAlreadyUpdatedException.java |   34 +
 .../dataflow/DataflowJobCancelledException.java |   39 +
 .../runners/dataflow/DataflowJobException.java  |   41 +
 .../dataflow/DataflowJobExecutionException.java |   35 +
 .../dataflow/DataflowJobUpdatedException.java   |   51 +
 .../runners/dataflow/DataflowPipelineJob.java   |  397 +++
 .../dataflow/DataflowPipelineRegistrar.java |   62 +
 .../dataflow/DataflowPipelineRunner.java| 3025 ++
 .../dataflow/DataflowPipelineRunnerHooks.java   |   39 +
 .../dataflow/DataflowPipelineTranslator.java| 1059 ++
 .../dataflow/DataflowServiceException.java  |   33 +
 .../dataflow/internal/AssignWindows.java|   89 +
 .../dataflow/internal/BigQueryIOTranslator.java |   72 +
 .../dataflow/internal/CustomSources.java|  121 +
 .../internal/DataflowAggregatorTransforms.java  |   81 +
 .../internal/DataflowMetricUpdateExtractor.java |  111 +
 .../dataflow/internal/PubsubIOTranslator.java   |  108 +
 .../dataflow/internal/ReadTranslator.java   |  105 +
 .../runners/dataflow/internal/package-info.java |   21 +
 .../BlockingDataflowPipelineOptions.java|   55 +
 .../dataflow/options/CloudDebuggerOptions.java  |   56 +
 .../options/DataflowPipelineDebugOptions.java   |  247 ++
 .../options/DataflowPipelineOptions.java|  126 +
 .../DataflowPipelineWorkerPoolOptions.java  |  263 ++
 .../options/DataflowProfilingOptions.java   |   50 +
 .../options/DataflowWorkerHarnessOptions.java   |   55 +
 .../options/DataflowWorkerLoggingOptions.java   |  159 +
 .../testing/TestDataflowPipelineOptions.java|   30 +
 .../testing/TestDataflowPipelineRunner.java |  273 ++
 .../dataflow/util/DataflowPathValidator.java|  100 +
 .../dataflow/util/DataflowTransport.java|  114 +
 .../beam/runners/dataflow/util/GcsStager.java   |   55 +
 .../runners/dataflow/util/MonitoringUtil.java   |  237 ++
 .../beam/runners/dataflow/util/PackageUtil.java |  333 ++
 .../beam/runners/dataflow/util/Stager.java  |   30 +
 .../BlockingDataflowPipelineOptions.java|   50 -
 .../beam/sdk/options/CloudDebuggerOptions.java  |   53 -
 .../options/DataflowPipelineDebugOptions.java   |  242 --
 .../sdk/options/DataflowPipelineOptions.java|  115 -
 .../DataflowPipelineWorkerPoolOptions.java  |  258 --
 .../sdk/options/DataflowProfilingOptions.java   |   48 -
 .../options/DataflowWorkerHarnessOptions.java   |   51 -
 .../options/DataflowWorkerLoggingOptions.java   |  155 -
 .../runners/BlockingDataflowPipelin

[18/21] incubator-beam git commit: Update pom.xml for java8tests

2016-04-26 Thread dhalperi
Update pom.xml for java8tests

Java8tests module doesn't have sources, only tests. Hence, all dependencies
should have scope of test. If not, dependency analysis correctly finds unused
dependencies.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/46f74471
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/46f74471
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/46f74471

Branch: refs/heads/master
Commit: 46f744714aae4cd45ba6284e2c9414e4bba5cd3e
Parents: 5e3d7ad
Author: Davor Bonaci 
Authored: Mon Apr 25 14:16:19 2016 -0700
Committer: Davor Bonaci 
Committed: Tue Apr 26 17:59:39 2016 -0700

--
 sdks/java/java8tests/pom.xml | 3 +++
 1 file changed, 3 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/46f74471/sdks/java/java8tests/pom.xml
--
diff --git a/sdks/java/java8tests/pom.xml b/sdks/java/java8tests/pom.xml
index cd7174a..f750a1c 100644
--- a/sdks/java/java8tests/pom.xml
+++ b/sdks/java/java8tests/pom.xml
@@ -150,18 +150,21 @@
   org.apache.beam
   java-sdk-all
   ${project.version}
+  test
 
 
 
   com.google.guava
   guava
   ${guava.version}
+  test
 
 
 
   joda-time
   joda-time
   ${joda.version}
+  test
 
 
 



[13/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
new file mode 100644
index 000..cff7e2b
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
@@ -0,0 +1,333 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow.util;
+
+import org.apache.beam.sdk.util.AttemptBoundedExponentialBackOff;
+import org.apache.beam.sdk.util.IOChannelUtils;
+import org.apache.beam.sdk.util.MimeTypes;
+import org.apache.beam.sdk.util.ZipFiles;
+
+import com.google.api.client.util.BackOffUtils;
+import com.google.api.client.util.Sleeper;
+import com.google.api.services.dataflow.model.DataflowPackage;
+import com.google.cloud.hadoop.util.ApiErrorExtractor;
+import com.google.common.hash.Funnels;
+import com.google.common.hash.Hasher;
+import com.google.common.hash.Hashing;
+import com.google.common.io.CountingOutputStream;
+import com.google.common.io.Files;
+
+import com.fasterxml.jackson.core.Base64Variants;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.channels.Channels;
+import java.nio.channels.WritableByteChannel;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Objects;
+
+/** Helper routines for packages. */
+public class PackageUtil {
+  private static final Logger LOG = LoggerFactory.getLogger(PackageUtil.class);
+  /**
+   * A reasonable upper bound on the number of jars required to launch a 
Dataflow job.
+   */
+  public static final int SANE_CLASSPATH_SIZE = 1000;
+  /**
+   * The initial interval to use between package staging attempts.
+   */
+  private static final long INITIAL_BACKOFF_INTERVAL_MS = 5000L;
+  /**
+   * The maximum number of attempts when staging a file.
+   */
+  private static final int MAX_ATTEMPTS = 5;
+
+  /**
+   * Translates exceptions from API calls.
+   */
+  private static final ApiErrorExtractor ERROR_EXTRACTOR = new 
ApiErrorExtractor();
+
+  /**
+   * Creates a DataflowPackage containing information about how a classpath 
element should be
+   * staged, including the staging destination as well as its size and hash.
+   *
+   * @param classpathElement The local path for the classpath element.
+   * @param stagingPath The base location for staged classpath elements.
+   * @param overridePackageName If non-null, use the given value as the 
package name
+   *instead of generating one automatically.
+   * @return The package.
+   */
+  @Deprecated
+  public static DataflowPackage createPackage(File classpathElement,
+  String stagingPath, String overridePackageName) {
+return createPackageAttributes(classpathElement, stagingPath, 
overridePackageName)
+.getDataflowPackage();
+  }
+
+  /**
+   * Compute and cache the attributes of a classpath element that we will need 
to stage it.
+   *
+   * @param classpathElement the file or directory to be staged.
+   * @param stagingPath The base location for staged classpath elements.
+   * @param overridePackageName If non-null, use the given value as the 
package name
+   *instead of generating one automatically.
+   * @return a {@link PackageAttributes} that containing metadata about the 
object to be staged.
+   */
+  static PackageAttributes createPackageAttributes(File classpathElement,
+  String stagingPath, String overridePackageName) {
+try {
+  boolean directory = classpathElement.isDirectory();
+
+  // Compute size and hash in one pass over file or directory.
+  Hasher hasher = Hashing.md5().newHasher();
+  OutputStream hashStream = Funnels.asOutputStream(hasher

[04/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerTest.java
deleted file mode 100644
index f888c5b..000
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerTest.java
+++ /dev/null
@@ -1,1400 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners;
-
-import static org.apache.beam.sdk.util.WindowedValue.valueInGlobalWindow;
-
-import static org.hamcrest.Matchers.containsInAnyOrder;
-import static org.hamcrest.Matchers.containsString;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.hasItem;
-import static org.hamcrest.Matchers.instanceOf;
-import static org.hamcrest.Matchers.startsWith;
-import static org.hamcrest.collection.IsIterableContainingInOrder.contains;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-import static org.mockito.Matchers.any;
-import static org.mockito.Matchers.anyString;
-import static org.mockito.Matchers.eq;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.when;
-
-import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.Pipeline.PipelineVisitor;
-import org.apache.beam.sdk.coders.BigEndianIntegerCoder;
-import org.apache.beam.sdk.coders.BigEndianLongCoder;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.VarLongCoder;
-import org.apache.beam.sdk.io.AvroIO;
-import org.apache.beam.sdk.io.AvroSource;
-import org.apache.beam.sdk.io.BigQueryIO;
-import org.apache.beam.sdk.io.Read;
-import org.apache.beam.sdk.io.TextIO;
-import org.apache.beam.sdk.options.DataflowPipelineDebugOptions;
-import org.apache.beam.sdk.options.DataflowPipelineOptions;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.options.PipelineOptions.CheckEnabled;
-import org.apache.beam.sdk.options.PipelineOptionsFactory;
-import org.apache.beam.sdk.runners.DataflowPipelineRunner.BatchViewAsList;
-import org.apache.beam.sdk.runners.DataflowPipelineRunner.BatchViewAsMap;
-import org.apache.beam.sdk.runners.DataflowPipelineRunner.BatchViewAsMultimap;
-import org.apache.beam.sdk.runners.DataflowPipelineRunner.TransformedMap;
-import org.apache.beam.sdk.runners.dataflow.TestCountingSource;
-import org.apache.beam.sdk.runners.worker.IsmFormat;
-import org.apache.beam.sdk.runners.worker.IsmFormat.IsmRecord;
-import org.apache.beam.sdk.runners.worker.IsmFormat.IsmRecordCoder;
-import org.apache.beam.sdk.runners.worker.IsmFormat.MetadataKeyCoder;
-import org.apache.beam.sdk.transforms.Create;
-import org.apache.beam.sdk.transforms.DoFnTester;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
-import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
-import org.apache.beam.sdk.transforms.windowing.PaneInfo;
-import org.apache.beam.sdk.util.CoderUtils;
-import org.apache.beam.sdk.util.GcsUtil;
-import org.apache.beam.sdk.util.NoopPathValidator;
-import org.apache.beam.sdk.util.ReleaseInfo;
-import org.apache.beam.sdk.util.TestCredential;
-import org.apache.beam.sdk.util.UserCodeException;
-import org.apache.beam.sdk.util.WindowedValue;
-import org.apache.beam.sdk.util.WindowedValue.FullWindowedValueCoder;
-import org.apache.beam.sdk.util.WindowingStrategy;
-import org.apache.beam.sdk.util.gcsfs.GcsPath;
-import org.apache.beam.sdk.values.KV;
-import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PDone;
-import org.apache.beam.sdk.values.PInput;
-import org.apache.beam.sdk.values.PValue;
-import org.apache.beam.sdk.values.Tim

[GitHub] incubator-beam pull request: [BEAM-77] Reorganize Java packages in...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/239


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[15/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerHooks.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerHooks.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerHooks.java
new file mode 100644
index 000..4d37966
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineRunnerHooks.java
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow;
+
+import org.apache.beam.sdk.annotations.Experimental;
+
+import com.google.api.services.dataflow.model.Environment;
+
+/**
+ * An instance of this class can be passed to the
+ * {@link DataflowPipelineRunner} to add user defined hooks to be
+ * invoked at various times during pipeline execution.
+ */
+@Experimental
+public class DataflowPipelineRunnerHooks {
+  /**
+   * Allows the user to modify the environment of their job before their job 
is submitted
+   * to the service for execution.
+   *
+   * @param environment The environment of the job. Users can make change to 
this instance in order
+   * to change the environment with which their job executes on the 
service.
+   */
+  public void modifyEnvironmentBeforeSubmission(Environment environment) {}
+}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
new file mode 100644
index 000..0f2d325
--- /dev/null
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
@@ -0,0 +1,1059 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.dataflow;
+
+import static org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray;
+import static org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString;
+import static org.apache.beam.sdk.util.StringUtils.jsonStringToByteArray;
+import static org.apache.beam.sdk.util.Structs.addBoolean;
+import static org.apache.beam.sdk.util.Structs.addDictionary;
+import static org.apache.beam.sdk.util.Structs.addList;
+import static org.apache.beam.sdk.util.Structs.addLong;
+import static org.apache.beam.sdk.util.Structs.addObject;
+import static org.apache.beam.sdk.util.Structs.addString;
+import static org.apache.beam.sdk.util.Structs.getString;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+import 
org.apache.beam.runners.dataflow.DataflowPipelineRunner.GroupByKeyAndSortValuesOnly;
+import org.apache.beam.runners.dataflow.internal.BigQueryIOTranslator;
+import org.apache.beam.runners.dataflow.internal.PubsubIOTranslator;
+import org.apache.beam.runners.dataflow.internal.ReadTranslator;
+import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
+

[10/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerHooks.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerHooks.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerHooks.java
deleted file mode 100644
index 7ea44d7..000
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineRunnerHooks.java
+++ /dev/null
@@ -1,39 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners;
-
-import org.apache.beam.sdk.annotations.Experimental;
-
-import com.google.api.services.dataflow.model.Environment;
-
-/**
- * An instance of this class can be passed to the
- * {@link DataflowPipelineRunner} to add user defined hooks to be
- * invoked at various times during pipeline execution.
- */
-@Experimental
-public class DataflowPipelineRunnerHooks {
-  /**
-   * Allows the user to modify the environment of their job before their job 
is submitted
-   * to the service for execution.
-   *
-   * @param environment The environment of the job. Users can make change to 
this instance in order
-   * to change the environment with which their job executes on the 
service.
-   */
-  public void modifyEnvironmentBeforeSubmission(Environment environment) {}
-}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineTranslator.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineTranslator.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineTranslator.java
deleted file mode 100644
index 5c0745f..000
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/runners/DataflowPipelineTranslator.java
+++ /dev/null
@@ -1,1058 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners;
-
-import static org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray;
-import static org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString;
-import static org.apache.beam.sdk.util.StringUtils.jsonStringToByteArray;
-import static org.apache.beam.sdk.util.Structs.addBoolean;
-import static org.apache.beam.sdk.util.Structs.addDictionary;
-import static org.apache.beam.sdk.util.Structs.addList;
-import static org.apache.beam.sdk.util.Structs.addLong;
-import static org.apache.beam.sdk.util.Structs.addObject;
-import static org.apache.beam.sdk.util.Structs.addString;
-import static org.apache.beam.sdk.util.Structs.getString;
-
-import static com.google.common.base.Preconditions.checkArgument;
-
-import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.Pipeline.PipelineVisitor;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.IterableCoder;
-import org.apache.beam.sdk.io.BigQueryIO;
-import org.apache.beam.sdk.io.PubsubIO;
-import org.apache.beam.sdk.io.Read;
-import org.apache.beam.sdk.options.DataflowPipelineOptions;
-import org.apache.beam.sdk.options.StreamingOptions;

[09/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/testing/TestDataflowPipelineRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/testing/TestDataflowPipelineRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/testing/TestDataflowPipelineRunner.java
deleted file mode 100644
index d647b0d..000
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/sdk/testing/TestDataflowPipelineRunner.java
+++ /dev/null
@@ -1,271 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.testing;
-
-import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.PipelineResult;
-import org.apache.beam.sdk.PipelineResult.State;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.runners.DataflowJobExecutionException;
-import org.apache.beam.sdk.runners.DataflowPipelineJob;
-import org.apache.beam.sdk.runners.DataflowPipelineRunner;
-import org.apache.beam.sdk.runners.PipelineRunner;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.util.MonitoringUtil;
-import org.apache.beam.sdk.util.MonitoringUtil.JobMessagesHandler;
-import org.apache.beam.sdk.values.PInput;
-import org.apache.beam.sdk.values.POutput;
-
-import com.google.api.services.dataflow.model.JobMessage;
-import com.google.api.services.dataflow.model.JobMetrics;
-import com.google.api.services.dataflow.model.MetricUpdate;
-import com.google.common.base.Joiner;
-import com.google.common.base.Optional;
-import com.google.common.base.Throwables;
-
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.IOException;
-import java.math.BigDecimal;
-import java.util.List;
-import java.util.Map;
-import java.util.concurrent.Callable;
-import java.util.concurrent.ConcurrentHashMap;
-import java.util.concurrent.ExecutionException;
-import java.util.concurrent.Future;
-import java.util.concurrent.TimeUnit;
-
-/**
- * {@link TestDataflowPipelineRunner} is a pipeline runner that wraps a
- * {@link DataflowPipelineRunner} when running tests against the {@link 
TestPipeline}.
- *
- * @see TestPipeline
- */
-public class TestDataflowPipelineRunner extends 
PipelineRunner {
-  private static final String TENTATIVE_COUNTER = "tentative";
-  private static final Logger LOG = 
LoggerFactory.getLogger(TestDataflowPipelineRunner.class);
-  private static final Map EXECUTION_RESULTS =
-  new ConcurrentHashMap();
-
-  private final TestDataflowPipelineOptions options;
-  private final DataflowPipelineRunner runner;
-  private int expectedNumberOfAssertions = 0;
-
-  TestDataflowPipelineRunner(TestDataflowPipelineOptions options) {
-this.options = options;
-this.runner = DataflowPipelineRunner.fromOptions(options);
-  }
-
-  /**
-   * Constructs a runner from the provided options.
-   */
-  public static TestDataflowPipelineRunner fromOptions(
-  PipelineOptions options) {
-TestDataflowPipelineOptions dataflowOptions = 
options.as(TestDataflowPipelineOptions.class);
-dataflowOptions.setStagingLocation(Joiner.on("/").join(
-new String[]{dataflowOptions.getTempRoot(),
-  dataflowOptions.getJobName(), "output", "results"}));
-
-return new TestDataflowPipelineRunner(dataflowOptions);
-  }
-
-  public static PipelineResult getPipelineResultByJobName(String jobName) {
-return EXECUTION_RESULTS.get(jobName);
-  }
-
-  @Override
-  public DataflowPipelineJob run(Pipeline pipeline) {
-return run(pipeline, runner);
-  }
-
-  DataflowPipelineJob run(Pipeline pipeline, DataflowPipelineRunner runner) {
-
-final DataflowPipelineJob job;
-try {
-  job = runner.run(pipeline);
-} catch (DataflowJobExecutionException ex) {
-  throw new IllegalStateException("The dataflow failed.");
-}
-
-LOG.info("Running Dataflow job {} with {} expected assertions.",
-job.getJobId(), expectedNumberOfAssertions);
-
-CancelWorkflowOnError messageHandler = new CancelWorkflowOnError(
-job, new MonitoringUtil.Print

[03/21] incubator-beam git commit: Reorganize Java packages in the sources of the Google Cloud Dataflow runner

2016-04-26 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02190985/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineTranslatorTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineTranslatorTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineTranslatorTest.java
deleted file mode 100644
index 27c0acc..000
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/sdk/runners/DataflowPipelineTranslatorTest.java
+++ /dev/null
@@ -1,965 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners;
-
-import static org.apache.beam.sdk.util.Structs.addObject;
-import static org.apache.beam.sdk.util.Structs.getDictionary;
-import static org.apache.beam.sdk.util.Structs.getString;
-
-import static org.hamcrest.Matchers.containsString;
-import static org.hamcrest.Matchers.hasKey;
-import static org.hamcrest.core.IsInstanceOf.instanceOf;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNull;
-import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
-import static org.mockito.Matchers.any;
-import static org.mockito.Matchers.anyString;
-import static org.mockito.Matchers.argThat;
-import static org.mockito.Matchers.eq;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.when;
-
-import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.StringUtf8Coder;
-import org.apache.beam.sdk.coders.VarIntCoder;
-import org.apache.beam.sdk.coders.VoidCoder;
-import org.apache.beam.sdk.io.TextIO;
-import org.apache.beam.sdk.options.DataflowPipelineOptions;
-import org.apache.beam.sdk.options.DataflowPipelineWorkerPoolOptions;
-import org.apache.beam.sdk.options.PipelineOptionsFactory;
-import 
org.apache.beam.sdk.runners.DataflowPipelineTranslator.TranslationContext;
-import org.apache.beam.sdk.transforms.Count;
-import org.apache.beam.sdk.transforms.Create;
-import org.apache.beam.sdk.transforms.DoFn;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.transforms.ParDo;
-import org.apache.beam.sdk.transforms.Sum;
-import org.apache.beam.sdk.transforms.View;
-import org.apache.beam.sdk.transforms.display.DisplayData;
-import org.apache.beam.sdk.util.GcsUtil;
-import org.apache.beam.sdk.util.OutputReference;
-import org.apache.beam.sdk.util.PropertyNames;
-import org.apache.beam.sdk.util.Structs;
-import org.apache.beam.sdk.util.TestCredential;
-import org.apache.beam.sdk.util.WindowingStrategy;
-import org.apache.beam.sdk.util.gcsfs.GcsPath;
-import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PCollectionTuple;
-import org.apache.beam.sdk.values.PDone;
-import org.apache.beam.sdk.values.TupleTag;
-
-import com.google.api.services.dataflow.Dataflow;
-import com.google.api.services.dataflow.model.DataflowPackage;
-import com.google.api.services.dataflow.model.Job;
-import com.google.api.services.dataflow.model.Step;
-import com.google.api.services.dataflow.model.WorkerPool;
-import com.google.common.collect.ImmutableList;
-import com.google.common.collect.ImmutableMap;
-import com.google.common.collect.ImmutableSet;
-import com.google.common.collect.Iterables;
-
-import org.hamcrest.Matchers;
-import org.junit.Assert;
-import org.junit.Rule;
-import org.junit.Test;
-import org.junit.internal.matchers.ThrowableMessageMatcher;
-import org.junit.rules.ExpectedException;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-import org.mockito.ArgumentMatcher;
-import org.mockito.invocation.InvocationOnMock;
-import org.mockito.stubbing.Answer;
-
-import java.io.IOException;
-import java.io.Serializable;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.HashMap;
-import java.util.LinkedList;
-import java.util.List;
-import java.util.Map;
-
-/**
- * Tests for DataflowPipelineTranslator.
- */
-@RunWith(JUnit4.class)
-public class DataflowPipelineTranslatorTest 

[jira] [Commented] (BEAM-222) TestCounterSource does not implement checkpointing contract

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259237#comment-15259237
 ] 

ASF GitHub Bot commented on BEAM-222:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/248


> TestCounterSource does not implement checkpointing contract
> ---
>
> Key: BEAM-222
> URL: https://issues.apache.org/jira/browse/BEAM-222
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>
> It will yield repeated values when restored from a checkpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: TestCountingSource: Don't advance beyond last valid value

2016-04-26 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 7bd2d365a -> 5e3d7ad20


TestCountingSource: Don't advance beyond last valid value


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ad592442
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ad592442
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ad592442

Branch: refs/heads/master
Commit: ad5924426a40ac7e8498e3a3937b3b75d7b8b932
Parents: 7bd2d36
Author: Mark Shields 
Authored: Tue Apr 26 16:34:32 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 17:08:43 2016 -0700

--
 .../sdk/runners/dataflow/TestCountingSource.java  |  4 ++--
 .../runners/dataflow/TestCountingSourceTest.java  | 18 ++
 2 files changed, 20 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ad592442/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSource.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSource.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSource.java
index 226b3cb..10631c2 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSource.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSource.java
@@ -173,7 +173,7 @@ public class TestCountingSource
 
 @Override
 public boolean advance() {
-  if (current >= numMessagesPerShard) {
+  if (current >= numMessagesPerShard - 1) {
 return false;
   }
   // If testing dedup, occasionally insert a duplicate value;
@@ -181,7 +181,7 @@ public class TestCountingSource
 return true;
   }
   current++;
-  return current < numMessagesPerShard;
+  return true;
 }
 
 @Override

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ad592442/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSourceTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSourceTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSourceTest.java
index 9377905..6ba060e 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSourceTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/dataflow/TestCountingSourceTest.java
@@ -53,4 +53,22 @@ public class TestCountingSourceTest {
 assertEquals(2L, (long) reader.getCurrent().getValue());
 assertFalse(reader.advance());
   }
+
+  @Test
+  public void testCanResumeWithExpandedCount() throws IOException {
+TestCountingSource source = new TestCountingSource(1);
+PipelineOptions options = PipelineOptionsFactory.create();
+TestCountingSource.CountingSourceReader reader =
+source.createReader(options, null /* no checkpoint */);
+assertTrue(reader.start());
+assertEquals(0L, (long) reader.getCurrent().getValue());
+assertFalse(reader.advance());
+TestCountingSource.CounterMark checkpoint = reader.getCheckpointMark();
+checkpoint.finalizeCheckpoint();
+source = new TestCountingSource(2);
+reader = source.createReader(options, checkpoint);
+assertTrue(reader.start());
+assertEquals(1L, (long) reader.getCurrent().getValue());
+assertFalse(reader.advance());
+  }
 }



[GitHub] incubator-beam pull request: [BEAM-222] Don't advance beyond last ...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/248


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Closes #248

2016-04-26 Thread dhalperi
Closes #248


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5e3d7ad2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5e3d7ad2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5e3d7ad2

Branch: refs/heads/master
Commit: 5e3d7ad204c628871dccef2d4befdf0dd1fc2137
Parents: 7bd2d36 ad59244
Author: Dan Halperin 
Authored: Tue Apr 26 17:08:53 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 17:08:53 2016 -0700

--
 .../sdk/runners/dataflow/TestCountingSource.java  |  4 ++--
 .../runners/dataflow/TestCountingSourceTest.java  | 18 ++
 2 files changed, 20 insertions(+), 2 deletions(-)
--




[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259196#comment-15259196
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/249

[BEAM-22] Return a map of CommittedBundle to Consumers from handleResult

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This allows the executor to be ignorant of the mapping from PValue to
Consumers, as well as allowing the TransformExecutor to pass bundles
that should only be consumed by specific PTransforms. This can occur if
a transform is incapable of processing a bundle.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
ippr_add_consumers_of_completed_transforms

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/249.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #249


commit 64f6b15b63c571d438be2cbc918580ccbbefdce0
Author: Thomas Groh 
Date:   2016-04-26T23:40:58Z

Return a map of CommittedBundle to Consumers from handleResult

This allows the executor to be ignorant of the mapping from PValue to
Consumers, as well as allowing the TransformExecutor to pass bundles
that should only be consumed by specific PTransforms. This can occur if
a transform is incapable of processing a bundle.




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Return a map of CommittedBu...

2016-04-26 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/249

[BEAM-22] Return a map of CommittedBundle to Consumers from handleResult

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This allows the executor to be ignorant of the mapping from PValue to
Consumers, as well as allowing the TransformExecutor to pass bundles
that should only be consumed by specific PTransforms. This can occur if
a transform is incapable of processing a bundle.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
ippr_add_consumers_of_completed_transforms

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/249.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #249


commit 64f6b15b63c571d438be2cbc918580ccbbefdce0
Author: Thomas Groh 
Date:   2016-04-26T23:40:58Z

Return a map of CommittedBundle to Consumers from handleResult

This allows the executor to be ignorant of the mapping from PValue to
Consumers, as well as allowing the TransformExecutor to pass bundles
that should only be consumed by specific PTransforms. This can occur if
a transform is incapable of processing a bundle.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-222) TestCounterSource does not implement checkpointing contract

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259187#comment-15259187
 ] 

ASF GitHub Bot commented on BEAM-222:
-

GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/248

[BEAM-222] Don't advance beyond last valid value

Turns out there was a bug with allowing 'advanced beyond last element' to 
be a valid counter state.
Only an internal reload test caught it.
Included unit test for that case.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam beam-222

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/248.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #248


commit 2631624ae820df66206d8bdcc2e1d84bedbc696a
Author: Mark Shields 
Date:   2016-04-26T23:34:32Z

Don't advance beyond last valid value




> TestCounterSource does not implement checkpointing contract
> ---
>
> Key: BEAM-222
> URL: https://issues.apache.org/jira/browse/BEAM-222
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>
> It will yield repeated values when restored from a checkpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-222] Don't advance beyond last ...

2016-04-26 Thread mshields822
GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/248

[BEAM-222] Don't advance beyond last valid value

Turns out there was a bug with allowing 'advanced beyond last element' to 
be a valid counter state.
Only an internal reload test caught it.
Included unit test for that case.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam beam-222

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/248.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #248


commit 2631624ae820df66206d8bdcc2e1d84bedbc696a
Author: Mark Shields 
Date:   2016-04-26T23:34:32Z

Don't advance beyond last valid value




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: Import blacklist

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/245


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/3] incubator-beam git commit: Closes #245

2016-04-26 Thread dhalperi
Closes #245


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7bd2d365
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7bd2d365
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7bd2d365

Branch: refs/heads/master
Commit: 7bd2d365ae1fe6afb02299637f17487481bc347d
Parents: 8a2a1ce 2b3b140
Author: Dan Halperin 
Authored: Tue Apr 26 16:29:46 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 16:29:46 2016 -0700

--
 sdks/java/checkstyle.xml   | 4 
 .../src/main/java/org/apache/beam/sdk/io/PubsubClient.java | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)
--




[1/3] incubator-beam git commit: checkstyle.xml: blacklist repackaged code from Google API Client

2016-04-26 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 8a2a1ced0 -> 7bd2d365a


checkstyle.xml: blacklist repackaged code from Google API Client


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8d6d59e1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8d6d59e1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8d6d59e1

Branch: refs/heads/master
Commit: 8d6d59e1e79696fa036297a8fad92dd6fd66db5e
Parents: 8a2a1ce
Author: Dan Halperin 
Authored: Tue Apr 26 15:49:46 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 15:49:46 2016 -0700

--
 sdks/java/checkstyle.xml | 4 
 1 file changed, 4 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/8d6d59e1/sdks/java/checkstyle.xml
--
diff --git a/sdks/java/checkstyle.xml b/sdks/java/checkstyle.xml
index 253b822..a4aab6e 100644
--- a/sdks/java/checkstyle.xml
+++ b/sdks/java/checkstyle.xml
@@ -122,6 +122,10 @@ page at http://checkstyle.sourceforge.net/config.html -->
   
 
 
+
+  
+
+
 
   
   



[2/3] incubator-beam git commit: PubsubClient: fix use of blacklisted Preconditions

2016-04-26 Thread dhalperi
PubsubClient: fix use of blacklisted Preconditions


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2b3b140d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2b3b140d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2b3b140d

Branch: refs/heads/master
Commit: 2b3b140da0d015d38f556369c199a9fd0565c3b8
Parents: 8d6d59e
Author: Dan Halperin 
Authored: Tue Apr 26 15:50:04 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 15:50:04 2016 -0700

--
 .../src/main/java/org/apache/beam/sdk/io/PubsubClient.java | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2b3b140d/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubClient.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubClient.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubClient.java
index ce5979b..f92b480 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubClient.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubClient.java
@@ -18,7 +18,7 @@
 
 package org.apache.beam.sdk.io;
 
-import com.google.api.client.repackaged.com.google.common.base.Preconditions;
+import static com.google.common.base.Preconditions.checkState;
 
 import java.io.IOException;
 import java.io.Serializable;
@@ -88,7 +88,7 @@ public interface PubsubClient extends AutoCloseable {
 
 public String getV1Beta1Path() {
   String[] splits = path.split("/");
-  Preconditions.checkState(splits.length == 4);
+  checkState(splits.length == 4);
   return String.format("/subscriptions/%s/%s", splits[1], splits[3]);
 }
 
@@ -136,7 +136,7 @@ public interface PubsubClient extends AutoCloseable {
 
 public String getV1Beta1Path() {
   String[] splits = path.split("/");
-  Preconditions.checkState(splits.length == 4);
+  checkState(splits.length == 4);
   return String.format("/topics/%s/%s", splits[1], splits[3]);
 }
 



[jira] [Commented] (BEAM-199) Improve fluent interface for manipulating DisplayData.Items

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259167#comment-15259167
 ] 

ASF GitHub Bot commented on BEAM-199:
-

GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/247

[BEAM-199] Update DisplayData APIs to make overrides more understandable

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam displaydata-api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #247


commit 6a63fc6dd536be49971dc4987cc0d0ab3f887829
Author: Scott Wegner 
Date:   2016-04-25T23:04:38Z

Change API to register DisplayData items to make overrides more 
understandable




> Improve fluent interface for manipulating DisplayData.Items
> ---
>
> Key: BEAM-199
> URL: https://issues.apache.org/jira/browse/BEAM-199
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> See discussion 
> [here|https://github.com/apache/incubator-beam/pull/126#discussion_r59785549].
>  The current fluent API may be difficult to use and could cause some 
> ambiguity. We have some ideas in the linked thread on how to improve it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-199] Update DisplayData APIs to...

2016-04-26 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/247

[BEAM-199] Update DisplayData APIs to make overrides more understandable

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam displaydata-api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #247


commit 6a63fc6dd536be49971dc4987cc0d0ab3f887829
Author: Scott Wegner 
Date:   2016-04-25T23:04:38Z

Change API to register DisplayData items to make overrides more 
understandable




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: Create build-tools module with consol...

2016-04-26 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/246

Create build-tools module with consolidated checkstyle config

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This creates a new top-level `build-tools` maven module which holds our 
checkstyle config and can be references from other modules. 

This follows the recommended project structure laid out in [Checkstyle 
Example: Multimodule 
Configuration](https://maven.apache.org/plugins/maven-checkstyle-plugin/examples/multi-module-config.html),
 which will also be useful for incorporating FindBugs in the build.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam buildtools

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #246


commit 737e34c782df34a8ab9ec4b50e8e9b1b2e83d13b
Author: Scott Wegner 
Date:   2016-04-26T17:49:26Z

Consolidate checkstyle configuration in new 'build-tools' module

commit eb7db3a81766708a24d3c6cb0d1ce38868281892
Author: Scott Wegner 
Date:   2016-04-26T18:01:16Z

Remove unused imort in Kafka test file

commit ec1296680e009653ede0a341341e0087631d4bc1
Author: Scott Wegner 
Date:   2016-04-26T18:04:04Z

Improve checkstyle header check

commit 1da2055484012523bbcd12536d4f6f7db391caca
Author: Scott Wegner 
Date:   2016-04-26T18:21:51Z

Improve comment validation checks

commit eb52d5835a1f4ad2787e07ee3b91b6f34114a131
Author: Scott Wegner 
Date:   2016-04-26T21:27:45Z

Upgrade checkstyle to latest version




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: Import blacklist

2016-04-26 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/245

Import blacklist

Use checkstyle to catch the illegal import of repackaged Guava.

Caught one invalid use case.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam import-blacklist

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/245.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #245


commit 8d6d59e1e79696fa036297a8fad92dd6fd66db5e
Author: Dan Halperin 
Date:   2016-04-26T22:49:46Z

checkstyle.xml: blacklist repackaged code from Google API Client

commit 2b3b140da0d015d38f556369c199a9fd0565c3b8
Author: Dan Halperin 
Date:   2016-04-26T22:50:04Z

PubsubClient: fix use of blacklisted Preconditions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/3] incubator-beam git commit: Fix multi-line string format

2016-04-26 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master aa43ec0b0 -> 8a2a1ced0


Fix multi-line string format


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/4a138e94
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/4a138e94
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/4a138e94

Branch: refs/heads/master
Commit: 4a138e943681b3a5a63f034dcb898750fb2fc4d9
Parents: 0d6c53a
Author: Scott Wegner 
Authored: Tue Apr 26 14:07:23 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 14:42:39 2016 -0700

--
 .../org/apache/beam/sdk/transforms/windowing/StubTrigger.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/4a138e94/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/StubTrigger.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/StubTrigger.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/StubTrigger.java
index 738c0bc..06218cf 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/StubTrigger.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/StubTrigger.java
@@ -17,7 +17,8 @@
  */
 package org.apache.beam.sdk.transforms.windowing;
 
-import com.google.api.client.util.Lists;
+import com.google.common.collect.Lists;
+
 import org.joda.time.Instant;
 
 import java.util.List;



[2/3] incubator-beam git commit: Reference correct Lists import

2016-04-26 Thread dhalperi
Reference correct Lists import


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/0d6c53ae
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/0d6c53ae
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/0d6c53ae

Branch: refs/heads/master
Commit: 0d6c53aefc6cdbdc560688b8cb6bb454a5bf61c6
Parents: aa43ec0
Author: Scott Wegner 
Authored: Tue Apr 26 14:07:10 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 14:42:39 2016 -0700

--
 .../sdk/transforms/windowing/AfterProcessingTimeTest.java| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/0d6c53ae/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/AfterProcessingTimeTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/AfterProcessingTimeTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/AfterProcessingTimeTest.java
index 8178d54..8d2b4a1 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/AfterProcessingTimeTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/windowing/AfterProcessingTimeTest.java
@@ -179,10 +179,10 @@ public class AfterProcessingTimeTest {
 .plusDelayOf(Duration.standardMinutes(10)))
 .buildTrigger();
 
-String expected = "AfterWatermark.pastEndOfWindow()" +
-".withLateFirings(AfterProcessingTime" +
-  ".pastFirstElementInPane()" +
-  ".plusDelayOf(10 minutes))";
+String expected = "AfterWatermark.pastEndOfWindow()"
++ ".withLateFirings(AfterProcessingTime"
++ ".pastFirstElementInPane()"
++ ".plusDelayOf(10 minutes))";
 
 assertEquals(expected, trigger.toString());
   }



[GitHub] incubator-beam pull request: Minor fixes

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/244


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/3] incubator-beam git commit: Closes #244

2016-04-26 Thread dhalperi
Closes #244


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8a2a1ced
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8a2a1ced
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8a2a1ced

Branch: refs/heads/master
Commit: 8a2a1ced03a6ca9304f48de23d5769e72feff36c
Parents: aa43ec0 4a138e9
Author: Dan Halperin 
Authored: Tue Apr 26 14:42:40 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 14:42:40 2016 -0700

--
 .../sdk/transforms/windowing/AfterProcessingTimeTest.java| 8 
 .../apache/beam/sdk/transforms/windowing/StubTrigger.java| 3 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)
--




[jira] [Commented] (BEAM-77) Reorganize Directory structure

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258953#comment-15258953
 ] 

ASF GitHub Bot commented on BEAM-77:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/97


> Reorganize Directory structure
> --
>
> Key: BEAM-77
> URL: https://issues.apache.org/jira/browse/BEAM-77
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Reporter: Frances Perry
>Assignee: Jean-Baptiste Onofré
>
> Now that we've done the initial Dataflow code drop, we will restructure 
> directories to provide space for additional SDKs and Runners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/3] incubator-beam git commit: [BEAM-77] Move bigtable in IO

2016-04-26 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 9746f0d1d -> aa43ec0b0


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d138ae54/sdks/java/io/google-cloud-platform/pom.xml
--
diff --git a/sdks/java/io/google-cloud-platform/pom.xml 
b/sdks/java/io/google-cloud-platform/pom.xml
new file mode 100644
index 000..d1d5cd6
--- /dev/null
+++ b/sdks/java/io/google-cloud-platform/pom.xml
@@ -0,0 +1,96 @@
+
+
+http://maven.apache.org/POM/4.0.0";
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd";>
+
+  4.0.0
+
+  
+org.apache.beam
+io-parent
+0.1.0-incubating-SNAPSHOT
+../pom.xml
+  
+
+  google-cloud-platform
+  Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform
+  IO library to read and write Google Cloud Platform systems from 
Beam.
+  jar
+
+  
+0.2.3
+  
+
+  
+
+  org.apache.beam
+  java-sdk-all
+  ${project.version}
+
+
+
+  io.grpc
+  grpc-all
+  0.12.0
+
+
+
+  com.google.cloud.bigtable
+  bigtable-protos
+  ${bigtable.version}
+
+
+
+  com.google.cloud.bigtable
+  bigtable-client-core
+  ${bigtable.version}
+
+
+
+
+  org.apache.beam
+  java-sdk-all
+  ${project.version}
+  tests
+  test
+
+
+
+  org.hamcrest
+  hamcrest-all
+  1.3
+  test
+
+
+
+  junit
+  junit
+  4.11
+  test
+
+
+
+  org.slf4j
+  slf4j-jdk14
+  1.7.14
+  test
+
+  
+
+
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d138ae54/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
new file mode 100644
index 000..f99eb1d
--- /dev/null
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
@@ -0,0 +1,1016 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.bigtable;
+
+import static com.google.common.base.Preconditions.checkArgument;
+import static com.google.common.base.Preconditions.checkNotNull;
+import static com.google.common.base.Preconditions.checkState;
+
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.VarLongCoder;
+import org.apache.beam.sdk.coders.protobuf.ProtoCoder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.io.BoundedSource.BoundedReader;
+import org.apache.beam.sdk.io.Sink.WriteOperation;
+import org.apache.beam.sdk.io.Sink.Writer;
+import org.apache.beam.sdk.io.range.ByteKey;
+import org.apache.beam.sdk.io.range.ByteKeyRange;
+import org.apache.beam.sdk.io.range.ByteKeyRangeTracker;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.runners.PipelineRunner;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.util.ReleaseInfo;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PDone;
+
+import com.google.bigtable.v1.Mutation;
+import com.google.bigtable.v1.Row;
+import com.google.bigtable.v1.RowFilter;
+import com.google.bigtable.v1.SampleRowKeysResponse;
+import com.google.cloud.bigtable.config.BigtableOptions;
+import com.google.common.base.MoreObjects;
+import com.google.common.collect.ImmutableList;
+import com.google.common.util.concurrent.FutureCallback;
+import com.google.common.util.concurrent.Futures;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.Empty;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactor

[2/3] incubator-beam git commit: [BEAM-77] Move bigtable in IO

2016-04-26 Thread dhalperi
[BEAM-77] Move bigtable in IO


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d138ae54
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d138ae54
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d138ae54

Branch: refs/heads/master
Commit: d138ae5445315ced3e0a198ab1e99773a5ebfee0
Parents: 9746f0d
Author: Jean-Baptiste Onofré 
Authored: Wed Apr 20 07:40:49 2016 +0200
Committer: Dan Halperin 
Committed: Tue Apr 26 14:08:54 2016 -0700

--
 pom.xml |1 -
 sdks/java/core/pom.xml  |   25 -
 .../apache/beam/sdk/io/bigtable/BigtableIO.java | 1016 --
 .../beam/sdk/io/bigtable/BigtableService.java   |  111 --
 .../sdk/io/bigtable/BigtableServiceImpl.java|  244 -
 .../beam/sdk/io/bigtable/package-info.java  |   23 -
 .../beam/sdk/io/bigtable/BigtableIOTest.java|  729 -
 sdks/java/io/google-cloud-platform/pom.xml  |   96 ++
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 1016 ++
 .../sdk/io/gcp/bigtable/BigtableService.java|  111 ++
 .../io/gcp/bigtable/BigtableServiceImpl.java|  244 +
 .../beam/sdk/io/gcp/bigtable/package-info.java  |   23 +
 .../sdk/io/gcp/bigtable/BigtableIOTest.java |  729 +
 sdks/java/io/pom.xml|1 +
 14 files changed, 2220 insertions(+), 2149 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d138ae54/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 03f0fb2..1bd18ec 100644
--- a/pom.xml
+++ b/pom.xml
@@ -102,7 +102,6 @@
 
 1.7.7
 v2-rev248-1.21.0
-0.2.3
 0.0.2
 v2-rev6-1.21.0
 v1b3-rev22-1.21.0

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d138ae54/sdks/java/core/pom.xml
--
diff --git a/sdks/java/core/pom.xml b/sdks/java/core/pom.xml
index c634e9c..07d2fce 100644
--- a/sdks/java/core/pom.xml
+++ b/sdks/java/core/pom.xml
@@ -227,7 +227,6 @@
   true
   
 
-  
com.google.cloud.bigtable:bigtable-client-core
   com.google.guava:guava
 
   
@@ -253,17 +252,6 @@
   com.google.thirdparty
   
org.apache.beam.sdk.repackaged.com.google.thirdparty
 
-
-  com.google.cloud.bigtable
-  
org.apache.beam.sdk.repackaged.com.google.cloud.bigtable
-  
-
com.google.cloud.bigtable.config.BigtableOptions*
-
com.google.cloud.bigtable.config.CredentialOptions*
-
com.google.cloud.bigtable.config.RetryOptions*
-
com.google.cloud.bigtable.grpc.BigtableClusterName
-
com.google.cloud.bigtable.grpc.BigtableTableName
-  
-
   
 
   
@@ -281,7 +269,6 @@
   
${project.artifactId}-bundled-${project.version}
   
 
-  
com.google.cloud.bigtable:bigtable-client-core
   com.google.guava:guava
 
   
@@ -394,18 +381,6 @@
 
 
 
-  com.google.cloud.bigtable
-  bigtable-protos
-  ${bigtable.version}
-
-
-
-  com.google.cloud.bigtable
-  bigtable-client-core
-  ${bigtable.version}
-
-
-
   com.google.api-client
   google-api-client
   ${google-clients.version}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d138ae54/sdks/java/core/src/main/java/org/apache/beam/sdk/io/bigtable/BigtableIO.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/bigtable/BigtableIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/bigtable/BigtableIO.java
deleted file mode 100644
index 28ff335..000
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/bigtable/BigtableIO.java
+++ /dev/null
@@ -1,1016 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed u

[3/3] incubator-beam git commit: Closes #97

2016-04-26 Thread dhalperi
Closes #97


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/aa43ec0b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/aa43ec0b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/aa43ec0b

Branch: refs/heads/master
Commit: aa43ec0b041b21a49601efcb3699456b52064f69
Parents: 9746f0d d138ae5
Author: Dan Halperin 
Authored: Tue Apr 26 14:08:55 2016 -0700
Committer: Dan Halperin 
Committed: Tue Apr 26 14:08:55 2016 -0700

--
 pom.xml |1 -
 sdks/java/core/pom.xml  |   25 -
 .../apache/beam/sdk/io/bigtable/BigtableIO.java | 1016 --
 .../beam/sdk/io/bigtable/BigtableService.java   |  111 --
 .../sdk/io/bigtable/BigtableServiceImpl.java|  244 -
 .../beam/sdk/io/bigtable/package-info.java  |   23 -
 .../beam/sdk/io/bigtable/BigtableIOTest.java|  729 -
 sdks/java/io/google-cloud-platform/pom.xml  |   96 ++
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 1016 ++
 .../sdk/io/gcp/bigtable/BigtableService.java|  111 ++
 .../io/gcp/bigtable/BigtableServiceImpl.java|  244 +
 .../beam/sdk/io/gcp/bigtable/package-info.java  |   23 +
 .../sdk/io/gcp/bigtable/BigtableIOTest.java |  729 +
 sdks/java/io/pom.xml|1 +
 14 files changed, 2220 insertions(+), 2149 deletions(-)
--




[GitHub] incubator-beam pull request: [BEAM-77] Create IO module and move b...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/97


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: Minor fixes

2016-04-26 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/244

Minor fixes

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam minor-fixes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/244.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #244


commit dccada7b6d3a948518339af71e3e8fc4819584a8
Author: Scott Wegner 
Date:   2016-04-26T21:07:10Z

Reference correct Lists import

commit 236bce7c536692580d93bcc6e45a6dc37f4f9b0b
Author: Scott Wegner 
Date:   2016-04-26T21:07:23Z

Fix multi-line string format




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-189) The Spark runner uses valueInEmptyWindow which causes values to be dropped

2016-04-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258880#comment-15258880
 ] 

Kenneth Knowles commented on BEAM-189:
--

Follow-up to remove the problematic method entirely is BEAM-230.

> The Spark runner uses valueInEmptyWindow which causes values to be dropped
> --
>
> Key: BEAM-189
> URL: https://issues.apache.org/jira/browse/BEAM-189
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Amit Sela
>Assignee: Amit Sela
>
> Values in empty windowed may be dropped at anytime and so the default 
> windowing should be with GlobalWindow
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258853#comment-15258853
 ] 

ASF GitHub Bot commented on BEAM-22:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/242


> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Output InProcessGroupByKeyO...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/242


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: This closes #242

2016-04-26 Thread kenn
This closes #242


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9746f0d1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9746f0d1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9746f0d1

Branch: refs/heads/master
Commit: 9746f0d1d78dd7ac9c37d414347eb95b528b4d47
Parents: 7dc1a40 a9e0f01
Author: Kenneth Knowles 
Authored: Tue Apr 26 13:26:49 2016 -0700
Committer: Kenneth Knowles 
Committed: Tue Apr 26 13:26:49 2016 -0700

--
 .../beam/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[1/2] incubator-beam git commit: Output InProcessGroupByKeyOnly elements in the Global Window

2016-04-26 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 7dc1a4047 -> 9746f0d1d


Output InProcessGroupByKeyOnly elements in the Global Window

This ensures that the values are not dropped if they are exploded (for
example, in the Watermark Manager).


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a9e0f017
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a9e0f017
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a9e0f017

Branch: refs/heads/master
Commit: a9e0f01748e00124d4851106cfa85da386f4531e
Parents: 7dc1a40
Author: Thomas Groh 
Authored: Tue Apr 26 11:46:26 2016 -0700
Committer: Thomas Groh 
Committed: Tue Apr 26 11:46:26 2016 -0700

--
 .../beam/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a9e0f017/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
index ec0af8d..4cec841 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
@@ -149,7 +149,7 @@ class GroupByKeyEvaluatorFactory implements 
TransformEvaluatorFactory {
 KeyedWorkItems.elementsWorkItem(key, groupedEntry.getValue());
 UncommittedBundle> bundle =
 evaluationContext.createKeyedBundle(inputBundle, key, 
application.getOutput());
-bundle.add(WindowedValue.valueInEmptyWindows(groupedKv));
+bundle.add(WindowedValue.valueInGlobalWindow(groupedKv));
 resultBuilder.addOutput(bundle);
   }
   return resultBuilder.build();



[jira] [Created] (BEAM-230) Remove WindowedValue#valueInEmptyWindows

2016-04-26 Thread Thomas Groh (JIRA)
Thomas Groh created BEAM-230:


 Summary: Remove WindowedValue#valueInEmptyWindows
 Key: BEAM-230
 URL: https://issues.apache.org/jira/browse/BEAM-230
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Thomas Groh
Assignee: Davor Bonaci


A WindowedValue in no windows does not exist, and can be dropped by a runner at 
any time.

We should also assert that any collection of windows is nonempty when creating 
a new WindowedValue. If a user wants to drop an element, they should explicitly 
filter it out rather than expecting it to be dropped by the runner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258682#comment-15258682
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/242

[BEAM-22] Output InProcessGroupByKeyOnly elements in the Global Window

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This ensures that the values are not dropped if they are exploded (for
example, in the Watermark Manager).

Similar to 
[BEAM-189](https://issues.apache.org/jira/browse/BEAM-189?jql=project%20%3D%20BEAM%20AND%20text%20~%20%22empty%22)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_gbk_global_window

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/242.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #242


commit a9e0f01748e00124d4851106cfa85da386f4531e
Author: Thomas Groh 
Date:   2016-04-26T18:46:26Z

Output InProcessGroupByKeyOnly elements in the Global Window

This ensures that the values are not dropped if they are exploded (for
example, in the Watermark Manager).




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-151) Create Dataflow Runner Package

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258683#comment-15258683
 ] 

ASF GitHub Bot commented on BEAM-151:
-

GitHub user lukecwik opened a pull request:

https://github.com/apache/incubator-beam/pull/243

[BEAM-151] Move over some more Dataflow specific classes.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Note that users should use proto ByteString instead of RandomAccessData
since it provides a safer version of the same functionality.

I hoped that I would be able to move over more of the *Cloud* classes
and their helpers but they are embedded part of coders. Nothing more
can be done here until there is an official Beam representation of a coder
decoupled from Dataflow CloudKnownTypes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam beam151

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #243


commit 305c002ecbbe031458dcc2da66b4a2a0b5f41174
Author: Luke Cwik 
Date:   2016-04-26T18:43:26Z

[BEAM-151] Move over some more Dataflow specific classes.

Note that users should use proto ByteString instead of RandomAccessData
since it provides a safer version of the same functionality.

I hoped that I would be able to move over more of the *Cloud* classes
and their helpers but they are embedded part of coders. Nothing more
can be done here until there is an official Beam representation of a coder
decoupled from Dataflow CloudKnownTypes.




> Create Dataflow Runner Package
> --
>
> Key: BEAM-151
> URL: https://issues.apache.org/jira/browse/BEAM-151
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> Move Dataflow runner out of SDK core and into new Dataflow runner maven 
> module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Output InProcessGroupByKeyO...

2016-04-26 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/242

[BEAM-22] Output InProcessGroupByKeyOnly elements in the Global Window

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This ensures that the values are not dropped if they are exploded (for
example, in the Watermark Manager).

Similar to 
[BEAM-189](https://issues.apache.org/jira/browse/BEAM-189?jql=project%20%3D%20BEAM%20AND%20text%20~%20%22empty%22)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_gbk_global_window

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/242.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #242


commit a9e0f01748e00124d4851106cfa85da386f4531e
Author: Thomas Groh 
Date:   2016-04-26T18:46:26Z

Output InProcessGroupByKeyOnly elements in the Global Window

This ensures that the values are not dropped if they are exploded (for
example, in the Watermark Manager).




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: [BEAM-151] Move over some more Datafl...

2016-04-26 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/incubator-beam/pull/243

[BEAM-151] Move over some more Dataflow specific classes.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Note that users should use proto ByteString instead of RandomAccessData
since it provides a safer version of the same functionality.

I hoped that I would be able to move over more of the *Cloud* classes
and their helpers but they are embedded part of coders. Nothing more
can be done here until there is an official Beam representation of a coder
decoupled from Dataflow CloudKnownTypes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam beam151

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #243


commit 305c002ecbbe031458dcc2da66b4a2a0b5f41174
Author: Luke Cwik 
Date:   2016-04-26T18:43:26Z

[BEAM-151] Move over some more Dataflow specific classes.

Note that users should use proto ByteString instead of RandomAccessData
since it provides a safer version of the same functionality.

I hoped that I would be able to move over more of the *Cloud* classes
and their helpers but they are embedded part of coders. Nothing more
can be done here until there is an official Beam representation of a coder
decoupled from Dataflow CloudKnownTypes.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: [BEAM-117] Minor updates for DisplayD...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/241


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-117) Implement the API for Static Display Metadata

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258481#comment-15258481
 ] 

ASF GitHub Bot commented on BEAM-117:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/241


> Implement the API for Static Display Metadata
> -
>
> Key: BEAM-117
> URL: https://issues.apache.org/jira/browse/BEAM-117
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Scott Wegner
>
> As described in the following doc, we would like the SDK to allow associating 
> display metadata with PTransforms.
> https://docs.google.com/document/d/11enEB9JwVp6vO0uOYYTMYTGkr3TdNfELwWqoiUg5ZxM/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[4/4] incubator-beam git commit: This closes #241

2016-04-26 Thread bchambers
This closes #241


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7dc1a404
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7dc1a404
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7dc1a404

Branch: refs/heads/master
Commit: 7dc1a4047c8a145377c26ed73163a28f7b8835dd
Parents: d299e2c c6690c1
Author: bchambers 
Authored: Tue Apr 26 09:39:06 2016 -0700
Committer: bchambers 
Committed: Tue Apr 26 09:39:06 2016 -0700

--
 .../org/apache/beam/sdk/io/DatastoreIO.java |  1 +
 .../main/java/org/apache/beam/sdk/io/Read.java  |  2 +
 .../main/java/org/apache/beam/sdk/io/Sink.java  |  2 +-
 .../java/org/apache/beam/sdk/io/Source.java |  2 +-
 .../main/java/org/apache/beam/sdk/io/Write.java |  1 +
 .../sdk/transforms/ApproximateQuantiles.java|  1 +
 .../beam/sdk/transforms/ApproximateUnique.java  |  2 +
 .../org/apache/beam/sdk/transforms/Combine.java | 11 
 .../beam/sdk/transforms/CombineFnBase.java  |  4 +-
 .../apache/beam/sdk/transforms/CombineFns.java  |  4 ++
 .../beam/sdk/transforms/CombineWithContext.java |  1 +
 .../org/apache/beam/sdk/transforms/DoFn.java|  2 +-
 .../beam/sdk/transforms/DoFnWithContext.java|  2 +-
 .../org/apache/beam/sdk/transforms/Filter.java  |  4 ++
 .../apache/beam/sdk/transforms/GroupByKey.java  |  1 +
 .../transforms/IntraBundleParallelization.java  |  1 +
 .../org/apache/beam/sdk/transforms/Max.java |  1 +
 .../org/apache/beam/sdk/transforms/Min.java |  1 +
 .../apache/beam/sdk/transforms/PTransform.java  |  2 +-
 .../org/apache/beam/sdk/transforms/ParDo.java   |  4 +-
 .../apache/beam/sdk/transforms/Partition.java   |  2 +
 .../org/apache/beam/sdk/transforms/Sample.java  |  2 +
 .../org/apache/beam/sdk/transforms/Top.java |  1 +
 .../sdk/transforms/display/ClassForDisplay.java |  4 +-
 .../sdk/transforms/display/DisplayData.java | 58 ++--
 .../sdk/transforms/display/HasDisplayData.java  |  8 +--
 .../transforms/windowing/CalendarWindows.java   |  6 ++
 .../sdk/transforms/windowing/FixedWindows.java  |  1 +
 .../beam/sdk/transforms/windowing/Sessions.java |  1 +
 .../transforms/windowing/SlidingWindows.java|  1 +
 .../beam/sdk/transforms/windowing/Window.java   |  1 +
 .../beam/sdk/transforms/windowing/WindowFn.java |  2 +-
 .../org/apache/beam/sdk/util/CombineFnUtil.java |  1 +
 .../io/BoundedReadFromUnboundedSourceTest.java  |  5 +-
 .../beam/sdk/io/CompressedSourceTest.java   |  5 +-
 .../java/org/apache/beam/sdk/io/ReadTest.java   |  6 +-
 .../java/org/apache/beam/sdk/io/WriteTest.java  |  5 +-
 .../beam/sdk/transforms/CombineFnsTest.java |  6 +-
 .../apache/beam/sdk/transforms/CombineTest.java |  4 +-
 .../IntraBundleParallelizationTest.java |  4 +-
 .../apache/beam/sdk/transforms/ParDoTest.java   |  8 +--
 .../transforms/display/DisplayDataMatchers.java | 14 ++---
 .../display/DisplayDataMatchersTest.java|  4 +-
 .../sdk/transforms/display/DisplayDataTest.java |  6 +-
 .../sdk/transforms/windowing/WindowTest.java|  4 +-
 45 files changed, 130 insertions(+), 78 deletions(-)
--




[3/4] incubator-beam git commit: Rename DisplayDataMatchers.includes to includesDisplayData to clarify usage

2016-04-26 Thread bchambers
Rename DisplayDataMatchers.includes to includesDisplayData to clarify usage


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/582befdc
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/582befdc
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/582befdc

Branch: refs/heads/master
Commit: 582befdc8aed6dc62bf1d205749d7f46d7894f97
Parents: d299e2c
Author: Scott Wegner 
Authored: Mon Apr 25 09:31:13 2016 -0700
Committer: bchambers 
Committed: Tue Apr 26 09:39:01 2016 -0700

--
 .../sdk/io/BoundedReadFromUnboundedSourceTest.java|  5 +++--
 .../org/apache/beam/sdk/io/CompressedSourceTest.java  |  5 +++--
 .../test/java/org/apache/beam/sdk/io/ReadTest.java|  6 +++---
 .../test/java/org/apache/beam/sdk/io/WriteTest.java   |  5 +++--
 .../apache/beam/sdk/transforms/CombineFnsTest.java|  6 +++---
 .../org/apache/beam/sdk/transforms/CombineTest.java   |  4 ++--
 .../transforms/IntraBundleParallelizationTest.java|  4 ++--
 .../org/apache/beam/sdk/transforms/ParDoTest.java |  8 
 .../sdk/transforms/display/DisplayDataMatchers.java   | 14 +++---
 .../transforms/display/DisplayDataMatchersTest.java   |  4 ++--
 .../beam/sdk/transforms/display/DisplayDataTest.java  |  6 +++---
 .../beam/sdk/transforms/windowing/WindowTest.java |  4 ++--
 12 files changed, 37 insertions(+), 34 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/582befdc/sdks/java/core/src/test/java/org/apache/beam/sdk/io/BoundedReadFromUnboundedSourceTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/BoundedReadFromUnboundedSourceTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/BoundedReadFromUnboundedSourceTest.java
index 30f3eca..f7949e7 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/BoundedReadFromUnboundedSourceTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/BoundedReadFromUnboundedSourceTest.java
@@ -17,7 +17,8 @@
  */
 package org.apache.beam.sdk.io;
 
-import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.includes;
+import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.includesDisplayDataFrom;
+
 import static org.hamcrest.Matchers.containsInAnyOrder;
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertThat;
@@ -79,7 +80,7 @@ public class BoundedReadFromUnboundedSourceTest implements 
Serializable{
 };
 
 BoundedReadFromUnboundedSource> read = 
Read.from(src).withMaxNumRecords(5);
-assertThat(DisplayData.from(read), includes(src));
+assertThat(DisplayData.from(read), includesDisplayDataFrom(src));
   }
 
   private static class Checker

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/582befdc/sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
index 4ad8904..36ef4b3 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/CompressedSourceTest.java
@@ -19,7 +19,8 @@ package org.apache.beam.sdk.io;
 
 import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasDisplayItem;
 import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasKey;
-import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.includes;
+import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.includesDisplayDataFrom;
+
 import static org.hamcrest.Matchers.containsString;
 import static org.hamcrest.Matchers.instanceOf;
 import static org.junit.Assert.assertFalse;
@@ -355,7 +356,7 @@ public class CompressedSourceTest {
 assertThat(compressedSourceDisplayData, 
hasDisplayItem(hasKey("compressionMode")));
 assertThat(gzipDisplayData, hasDisplayItem("compressionMode", 
CompressionMode.GZIP.toString()));
 assertThat(compressedSourceDisplayData, hasDisplayItem("source", 
inputSource.getClass()));
-assertThat(compressedSourceDisplayData, includes(inputSource));
+assertThat(compressedSourceDisplayData, 
includesDisplayDataFrom(inputSource));
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/582befdc/sdks/java/core/src/test/java/org/apache/beam/sdk/io/ReadTest.java
--
diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/ReadTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/i

[2/4] incubator-beam git commit: Normalize comments to refer to 'display data' rather than 'display metadata'

2016-04-26 Thread bchambers
Normalize comments to refer to 'display data' rather than 'display metadata'


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9d45a4a1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9d45a4a1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9d45a4a1

Branch: refs/heads/master
Commit: 9d45a4a11dcf66434c3d434d31900487471dcbe3
Parents: 582befd
Author: Scott Wegner 
Authored: Mon Apr 25 09:34:58 2016 -0700
Committer: bchambers 
Committed: Tue Apr 26 09:39:01 2016 -0700

--
 .../main/java/org/apache/beam/sdk/io/Sink.java  |  2 +-
 .../java/org/apache/beam/sdk/io/Source.java |  2 +-
 .../beam/sdk/transforms/CombineFnBase.java  |  4 +-
 .../org/apache/beam/sdk/transforms/DoFn.java|  2 +-
 .../beam/sdk/transforms/DoFnWithContext.java|  2 +-
 .../apache/beam/sdk/transforms/PTransform.java  |  2 +-
 .../org/apache/beam/sdk/transforms/ParDo.java   |  2 +-
 .../sdk/transforms/display/ClassForDisplay.java |  4 +-
 .../sdk/transforms/display/DisplayData.java | 58 ++--
 .../sdk/transforms/display/HasDisplayData.java  |  8 +--
 .../beam/sdk/transforms/windowing/WindowFn.java |  2 +-
 11 files changed, 44 insertions(+), 44 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9d45a4a1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Sink.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Sink.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Sink.java
index 8b6b637..20b1631 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Sink.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Sink.java
@@ -139,7 +139,7 @@ public abstract class Sink implements Serializable, 
HasDisplayData {
* {@inheritDoc}
*
* By default, does not register any display data. Implementors may 
override this method
-   * to provide their own display metadata.
+   * to provide their own display data.
*/
   @Override
   public void populateDisplayData(DisplayData.Builder builder) {}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9d45a4a1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Source.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Source.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Source.java
index 2ab0d4e..b8902f9 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Source.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Source.java
@@ -72,7 +72,7 @@ public abstract class Source implements Serializable, 
HasDisplayData {
* {@inheritDoc}
*
* By default, does not register any display data. Implementors may 
override this method
-   * to provide their own display metadata.
+   * to provide their own display data.
*/
   @Override
   public void populateDisplayData(DisplayData.Builder builder) {}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9d45a4a1/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
index 1b64bb2..c73ba54 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/CombineFnBase.java
@@ -225,7 +225,7 @@ public class CombineFnBase {
  * {@inheritDoc}
  *
  * By default, does not register any display data. Implementors may 
override this method
- * to provide their own display metadata.
+ * to provide their own display data.
  */
 @Override
 public void populateDisplayData(DisplayData.Builder builder) {
@@ -300,7 +300,7 @@ public class CombineFnBase {
  * {@inheritDoc}
  *
  * By default, does not register any display data. Implementors may 
override this method
- * to provide their own display metadata.
+ * to provide their own display data.
  */
 @Override
 public void populateDisplayData(DisplayData.Builder builder) {

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9d45a4a1/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
index 04f272d..6d5d1ed 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/D

[1/4] incubator-beam git commit: Add super.populateDisplayData() to standard implementations. The current super implementation is a no-op, but this is the recommended way to implement the pattern

2016-04-26 Thread bchambers
Repository: incubator-beam
Updated Branches:
  refs/heads/master d299e2c25 -> 7dc1a4047


Add super.populateDisplayData() to standard implementations. The current super 
implementation is a no-op, but this is the recommended way to implement the 
pattern


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c6690c18
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c6690c18
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c6690c18

Branch: refs/heads/master
Commit: c6690c18a64ec6c7971712ab43807b13c1849571
Parents: 9d45a4a
Author: Scott Wegner 
Authored: Mon Apr 25 09:41:07 2016 -0700
Committer: bchambers 
Committed: Tue Apr 26 09:39:01 2016 -0700

--
 .../main/java/org/apache/beam/sdk/io/DatastoreIO.java|  1 +
 .../core/src/main/java/org/apache/beam/sdk/io/Read.java  |  2 ++
 .../core/src/main/java/org/apache/beam/sdk/io/Write.java |  1 +
 .../apache/beam/sdk/transforms/ApproximateQuantiles.java |  1 +
 .../apache/beam/sdk/transforms/ApproximateUnique.java|  2 ++
 .../java/org/apache/beam/sdk/transforms/Combine.java | 11 +++
 .../java/org/apache/beam/sdk/transforms/CombineFns.java  |  4 
 .../apache/beam/sdk/transforms/CombineWithContext.java   |  1 +
 .../main/java/org/apache/beam/sdk/transforms/Filter.java |  4 
 .../java/org/apache/beam/sdk/transforms/GroupByKey.java  |  1 +
 .../beam/sdk/transforms/IntraBundleParallelization.java  |  1 +
 .../main/java/org/apache/beam/sdk/transforms/Max.java|  1 +
 .../main/java/org/apache/beam/sdk/transforms/Min.java|  1 +
 .../main/java/org/apache/beam/sdk/transforms/ParDo.java  |  2 ++
 .../java/org/apache/beam/sdk/transforms/Partition.java   |  2 ++
 .../main/java/org/apache/beam/sdk/transforms/Sample.java |  2 ++
 .../main/java/org/apache/beam/sdk/transforms/Top.java|  1 +
 .../beam/sdk/transforms/windowing/CalendarWindows.java   |  6 ++
 .../beam/sdk/transforms/windowing/FixedWindows.java  |  1 +
 .../apache/beam/sdk/transforms/windowing/Sessions.java   |  1 +
 .../beam/sdk/transforms/windowing/SlidingWindows.java|  1 +
 .../org/apache/beam/sdk/transforms/windowing/Window.java |  1 +
 .../java/org/apache/beam/sdk/util/CombineFnUtil.java |  1 +
 23 files changed, 49 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c6690c18/sdks/java/core/src/main/java/org/apache/beam/sdk/io/DatastoreIO.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/DatastoreIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/DatastoreIO.java
index 4348950..c265659 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/DatastoreIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/DatastoreIO.java
@@ -604,6 +604,7 @@ public class DatastoreIO {
 
 @Override
 public void populateDisplayData(DisplayData.Builder builder) {
+  super.populateDisplayData(builder);
   builder
   .addIfNotDefault("host", host, DEFAULT_HOST)
   .addIfNotNull("dataset", datasetId);

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c6690c18/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java
index 1f41e5c..05b70ac 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java
@@ -147,6 +147,7 @@ public class Read {
 
 @Override
 public void populateDisplayData(DisplayData.Builder builder) {
+  super.populateDisplayData(builder);
   builder
   .add("source", source.getClass())
   .include(source);
@@ -261,6 +262,7 @@ public class Read {
 
 @Override
 public void populateDisplayData(DisplayData.Builder builder) {
+  super.populateDisplayData(builder);
   builder
   .add("source", source.getClass())
   .include(source);

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c6690c18/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Write.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Write.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Write.java
index b8fa259..a7d182d 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Write.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Write.java
@@ -79,6 +79,7 @@ public class Write {
 
 @Override
 public void populateDisplayData(DisplayData.Builder builder) {
+  super.populateDisplayDa

[jira] [Commented] (BEAM-168) IntervalBoundedExponentialBackOff change broke Beam-on-Dataflow

2016-04-26 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258369#comment-15258369
 ] 

Daniel Halperin commented on BEAM-168:
--

Like to keep this open until I can remove the deprecated code.

> IntervalBoundedExponentialBackOff change broke Beam-on-Dataflow
> ---
>
> Key: BEAM-168
> URL: https://issues.apache.org/jira/browse/BEAM-168
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>
> Changing the `int` to a `long` breaks ABI compatibility, which Dataflow 
> service uses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-185) XmlSink output file pattern missing "." in extension

2016-04-26 Thread Scott Wegner (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258349#comment-15258349
 ] 

Scott Wegner commented on BEAM-185:
---

You're right. I looked into this more, and the usage of this is fixed within 
FileBasedSink. Closing this.

> XmlSink output file pattern missing "." in extension
> 
>
> Key: BEAM-185
> URL: https://issues.apache.org/jira/browse/BEAM-185
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Priority: Minor
>
> The XmlSink takes as input a filename prefix and adds the shard name and 
> extension automatically. However, it is missing the "." when adding the 
> extension.
> For an XmlSink configured as:
> {{XmlSink.write().toFilenamePrefix("foobar");}}
> the fileNamePattern is {{foobar-S-of-Nxml}}. Instead, it should be 
> {{foobar-S-of-N.xml}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-185) XmlSink output file pattern missing "." in extension

2016-04-26 Thread Scott Wegner (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-185.
-
Resolution: Won't Fix

> XmlSink output file pattern missing "." in extension
> 
>
> Key: BEAM-185
> URL: https://issues.apache.org/jira/browse/BEAM-185
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Priority: Minor
>
> The XmlSink takes as input a filename prefix and adds the shard name and 
> extension automatically. However, it is missing the "." when adding the 
> extension.
> For an XmlSink configured as:
> {{XmlSink.write().toFilenamePrefix("foobar");}}
> the fileNamePattern is {{foobar-S-of-Nxml}}. Instead, it should be 
> {{foobar-S-of-N.xml}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: Add DirtyBit to represent whether Cou...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/219


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Add DirtyBit to represent whether Counters have been committed.

2016-04-26 Thread bchambers
Repository: incubator-beam
Updated Branches:
  refs/heads/master c59ca3868 -> d299e2c25


Add DirtyBit to represent whether Counters have been committed.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/72cfa709
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/72cfa709
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/72cfa709

Branch: refs/heads/master
Commit: 72cfa709689496f02cbe00fdc3f281e45fa11b23
Parents: c59ca38
Author: Pei He 
Authored: Tue Apr 19 18:21:40 2016 -0700
Committer: bchambers 
Committed: Tue Apr 26 08:24:30 2016 -0700

--
 .../apache/beam/sdk/util/common/Counter.java| 459 ---
 .../beam/sdk/util/common/CounterTest.java   | 146 +-
 2 files changed, 448 insertions(+), 157 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/72cfa709/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/Counter.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/Counter.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/Counter.java
index 6024576..9f9b0c1 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/Counter.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/Counter.java
@@ -25,6 +25,7 @@ import static 
com.google.common.base.Preconditions.checkArgument;
 
 import org.apache.beam.sdk.values.TypeDescriptor;
 
+import com.google.common.annotations.VisibleForTesting;
 import com.google.common.util.concurrent.AtomicDouble;
 
 import java.util.Objects;
@@ -44,6 +45,14 @@ import javax.annotation.Nullable;
  * Counters compare using value equality of their name, kind, and
  * cumulative value.  Equal counters should have equal toString()s.
  *
+ * After all possible mutations have completed, the reader should check
+ * {@link #isDirty} for every counter, otherwise updates may be lost.
+ *
+ * A counter may become dirty without a corresponding update to the value.
+ * This generally will occur when the calls to {@code addValue()}, {@code 
committing()},
+ * and {@code committed()} are interleaved such that the value is updated
+ * between the calls to committing and the read of the value.
+ *
  * @param  the type of values aggregated by this counter
  */
 public abstract class Counter {
@@ -295,6 +304,76 @@ public abstract class Counter {
   public abstract CounterMean getMean();
 
   /**
+   * Represents whether counters' values have been committed to the backend.
+   *
+   * Runners can use this information to optimize counters updates.
+   * For example, if counters are committed, runners may choose to skip the 
updates.
+   *
+   * Counters' state transition table:
+   * {@code
+   * Action\Current State COMMITTEDDIRTYCOMMITTING
+   * addValue()   DIRTYDIRTYDIRTY
+   * committing() None COMMITTING   None
+   * committed()  None None COMMITTED
+   * }
+   */
+  @VisibleForTesting
+  enum CommitState {
+/**
+ * There are no local updates that haven't been committed to the backend.
+ */
+COMMITTED,
+/**
+ * There are local updates that haven't been committed to the backend.
+ */
+DIRTY,
+/**
+ * Local updates are committing to the backend, but are still pending.
+ */
+COMMITTING,
+  }
+
+  /**
+   * Returns if the counter contains non-committed aggregate.
+   */
+  public boolean isDirty() {
+return commitState.get() != CommitState.COMMITTED;
+  }
+
+  /**
+   * Changes the counter from {@code CommitState.DIRTY} to {@code 
CommitState.COMMITTING}.
+   *
+   * @return true if successful. False return indicates that the commit state
+   * was not in {@code CommitState.DIRTY}.
+   */
+  public boolean committing() {
+return commitState.compareAndSet(CommitState.DIRTY, 
CommitState.COMMITTING);
+  }
+
+  /**
+   * Changes the counter from {@code CommitState.COMMITTING} to {@code 
CommitState.COMMITTED}.
+   *
+   * @return true if successful.
+   *
+   * False return indicates that the counter was updated while the 
committing is pending.
+   * That counter update might or might not has been committed. The {@code 
commitState} has to
+   * stay in {@code CommitState.DIRTY}.
+   */
+  public boolean committed() {
+return commitState.compareAndSet(CommitState.COMMITTING, 
CommitState.COMMITTED);
+  }
+
+  /**
+   * Sets the counter to {@code CommitState.DIRTY}.
+   *
+   * Must be called at the end of {@link #addValue}, {@link #resetToValue},
+   * {@link #resetMeanToValue}, and {@link #merge}.
+   */
+  protected void setDirty() {
+commitState.set(CommitState.DIR

[2/2] incubator-beam git commit: This closes #219

2016-04-26 Thread bchambers
This closes #219


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d299e2c2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d299e2c2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d299e2c2

Branch: refs/heads/master
Commit: d299e2c25a37890789c0de1bc557f7da0d94c0f0
Parents: c59ca38 72cfa70
Author: bchambers 
Authored: Tue Apr 26 08:24:36 2016 -0700
Committer: bchambers 
Committed: Tue Apr 26 08:24:36 2016 -0700

--
 .../apache/beam/sdk/util/common/Counter.java| 459 ---
 .../beam/sdk/util/common/CounterTest.java   | 146 +-
 2 files changed, 448 insertions(+), 157 deletions(-)
--




[jira] [Commented] (BEAM-117) Implement the API for Static Display Metadata

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258326#comment-15258326
 ] 

ASF GitHub Bot commented on BEAM-117:
-

GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/241

[BEAM-117] Minor updates for DisplayData usage

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This PR addresses the following JIRA issues:

* [[BEAM-195] Rename 
DisplayDataMatchers#includes](https://issues.apache.org/jira/browse/BEAM-195)
* [[BEAM-200] Standardize terminology to "display data" in 
documentation](https://issues.apache.org/jira/browse/BEAM-200)
* [[BEAM-219] Use super.populateDisplayData 
consistently](https://issues.apache.org/jira/browse/BEAM-219)

[[BEAM-199] Improve fluent interface for manipulating 
DisplayData.Items](https://issues.apache.org/jira/browse/BEAM-199) will be in 
the next PR as to not bloat this one too much.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam displaydata-tweaks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/241.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #241


commit 08b475f1971b3f74c9b321941cdbe2818c25c939
Author: Scott Wegner 
Date:   2016-04-25T16:31:13Z

Rename DisplayDataMatchers.includes to includesDisplayData to clarify usage

commit 5a1d93ce01000df8577acd66d6e9c6ebf5d6948f
Author: Scott Wegner 
Date:   2016-04-25T16:34:58Z

Normalize comments to refer to 'display data' rather than 'display metadata'

commit 954e734fbde4a055ee97d84b9f1c91d598fc5de9
Author: Scott Wegner 
Date:   2016-04-25T16:41:07Z

Add super.populateDisplayData() to standard implementations. The current 
super implementation is a no-op, but this is the recommended way to implement 
the pattern




> Implement the API for Static Display Metadata
> -
>
> Key: BEAM-117
> URL: https://issues.apache.org/jira/browse/BEAM-117
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Scott Wegner
>
> As described in the following doc, we would like the SDK to allow associating 
> display metadata with PTransforms.
> https://docs.google.com/document/d/11enEB9JwVp6vO0uOYYTMYTGkr3TdNfELwWqoiUg5ZxM/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-117] Minor updates for DisplayD...

2016-04-26 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/241

[BEAM-117] Minor updates for DisplayData usage

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This PR addresses the following JIRA issues:

* [[BEAM-195] Rename 
DisplayDataMatchers#includes](https://issues.apache.org/jira/browse/BEAM-195)
* [[BEAM-200] Standardize terminology to "display data" in 
documentation](https://issues.apache.org/jira/browse/BEAM-200)
* [[BEAM-219] Use super.populateDisplayData 
consistently](https://issues.apache.org/jira/browse/BEAM-219)

[[BEAM-199] Improve fluent interface for manipulating 
DisplayData.Items](https://issues.apache.org/jira/browse/BEAM-199) will be in 
the next PR as to not bloat this one too much.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam displaydata-tweaks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/241.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #241


commit 08b475f1971b3f74c9b321941cdbe2818c25c939
Author: Scott Wegner 
Date:   2016-04-25T16:31:13Z

Rename DisplayDataMatchers.includes to includesDisplayData to clarify usage

commit 5a1d93ce01000df8577acd66d6e9c6ebf5d6948f
Author: Scott Wegner 
Date:   2016-04-25T16:34:58Z

Normalize comments to refer to 'display data' rather than 'display metadata'

commit 954e734fbde4a055ee97d84b9f1c91d598fc5de9
Author: Scott Wegner 
Date:   2016-04-25T16:41:07Z

Add super.populateDisplayData() to standard implementations. The current 
super implementation is a no-op, but this is the recommended way to implement 
the pattern




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (BEAM-229) Add support for additional Spark configuration via PipelineOptions

2016-04-26 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela updated BEAM-229:
---
Component/s: runner-spark

> Add support for additional Spark configuration via PipelineOptions
> --
>
> Key: BEAM-229
> URL: https://issues.apache.org/jira/browse/BEAM-229
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Amit Sela
>
> Provide support for at least some of the Spark configurations via 
> PipelineOptions - memory, cores, etc.
> Need to pass PipelineOptions to 
> https://github.com/apache/incubator-beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkContextFactory.java
>  where the SparkConf is created. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-229) Add support for additional Spark configuration via PipelineOptions

2016-04-26 Thread Amit Sela (JIRA)
Amit Sela created BEAM-229:
--

 Summary: Add support for additional Spark configuration via 
PipelineOptions
 Key: BEAM-229
 URL: https://issues.apache.org/jira/browse/BEAM-229
 Project: Beam
  Issue Type: Improvement
Reporter: Amit Sela


Provide support for at least some of the Spark configurations via 
PipelineOptions - memory, cores, etc.

Need to pass PipelineOptions to 
https://github.com/apache/incubator-beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkContextFactory.java
 where the SparkConf is created. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-129) Support pubsub IO

2016-04-26 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257826#comment-15257826
 ] 

Maximilian Michels commented on BEAM-129:
-

Great, getting excited :)

> Support pubsub IO
> -
>
> Key: BEAM-129
> URL: https://issues.apache.org/jira/browse/BEAM-129
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kostas Kloudas
>
> Support pubsub IO



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)