Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3899

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4021

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-2980) BagState.isEmpty needs a tighter spec

2017-09-21 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2980:
-

 Summary: BagState.isEmpty needs a tighter spec
 Key: BEAM-2980
 URL: https://issues.apache.org/jira/browse/BEAM-2980
 Project: Beam
  Issue Type: Bug
  Components: beam-model
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


Consider the following:

{code}
BagState<Bizzle> myBag = ...; // empty
ReadableState<Boolean> isMyBagEmpty = myBag.isEmpty();
myBag.add(bizzle);
boolean empty = isMyBagEmpty.read();
{code}

Should {{empty}} be true or false? We need a consistent answer, across all 
kinds of state, as to when a snapshot of the state is taken.
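
For illustration only (plain Java with a List standing in for BagState, not 
Beam's actual state API), the two candidate semantics differ as follows:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class IsEmptySemantics {
  public static void main(String[] args) {
    List<String> bag = new ArrayList<>();

    // "Snapshot" semantics: the answer is captured when isEmpty() is called.
    boolean snapshot = bag.isEmpty();
    // "Live" semantics: the answer is computed when read() is called.
    Supplier<Boolean> live = bag::isEmpty;

    bag.add("bizzle");

    System.out.println(snapshot);   // true: captured before add()
    System.out.println(live.get()); // false: evaluated after add()
  }
}
{code}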





Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4851

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4020

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3898

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2956) DataflowRunner incorrectly reports the user agent for the Dataflow distribution

2017-09-21 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax resolved BEAM-2956.
--
Resolution: Fixed

> DataflowRunner incorrectly reports the user agent for the Dataflow 
> distribution
> ---
>
> Key: BEAM-2956
> URL: https://issues.apache.org/jira/browse/BEAM-2956
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: 2.2.0
>
>
> When distributed with the Dataflow SDK distribution, the DataflowRunner may 
> incorrectly submit the user agent and properties of the Apache Beam 
> distribution.
> This occurs when the Apache Beam jars appear on the classpath before the 
> Dataflow SDK distribution jars. The fix is to stop shipping two files at the 
> same path and instead use two distinct paths: if the second path is absent, 
> we are running the Apache Beam distribution; if it is present, we are running 
> the Dataflow distribution.
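
A minimal sketch of the described fix (illustrative only; the resource path 
and property names below are hypothetical, not Beam's actual ones):

{code:java}
import java.io.InputStream;
import java.util.Properties;

public class DistributionProperties {
  // Hypothetical path that only the Dataflow distribution ships.
  private static final String DATAFLOW_PROPERTIES =
      "/com/example/dataflow-distribution.properties";

  public static Properties load() throws Exception {
    Properties props = new Properties();
    InputStream in = DistributionProperties.class.getResourceAsStream(DATAFLOW_PROPERTIES);
    if (in == null) {
      // The path is absent: this is the Apache Beam distribution.
      props.setProperty("user.agent", "Apache_Beam_SDK");
    } else {
      // The path is present: this is the Dataflow distribution; its properties win.
      try (InputStream closeable = in) {
        props.load(closeable);
      }
    }
    return props;
  }
}
{code}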





[jira] [Closed] (BEAM-2834) NullPointerException @ BigQueryServicesImpl.java:759

2017-09-21 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax closed BEAM-2834.

Resolution: Fixed

> NullPointerException @ BigQueryServicesImpl.java:759
> 
>
> Key: BEAM-2834
> URL: https://issues.apache.org/jira/browse/BEAM-2834
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.1.0
>Reporter: Andy Barron
>Assignee: Reuven Lax
> Fix For: 2.2.0
>
>
> {code}
> java.lang.NullPointerException
>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:759)
>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:809)
>   at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:126)
>   at org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:96)
> {code}
> Going through the stack trace, the likely culprit is a null {{retryPolicy}} 
> in {{StreamingWriteFn}}.
> For context, this error showed up about 70 times between 1 am and 1 pm 
> Pacific time (2017-08-31) on a Dataflow streaming job.
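
The defaulting guard that addresses this (see commit cc24c86e later in this 
digest) falls back to {{InsertRetryPolicy.alwaysRetry()}} when no policy was 
configured; a runnable sketch of the same idea, using a stand-in interface:

{code:java}
import com.google.common.base.MoreObjects;

public class RetryPolicyDefault {
  // Stand-in for Beam's InsertRetryPolicy, for illustration only.
  interface InsertRetryPolicy {
    static InsertRetryPolicy alwaysRetry() {
      return new InsertRetryPolicy() {};
    }
  }

  // Simulates a user who never called withFailedInsertRetryPolicy().
  static final InsertRetryPolicy CONFIGURED = null;

  public static void main(String[] args) {
    // Never propagate a null policy into the streaming write path.
    InsertRetryPolicy policy =
        MoreObjects.firstNonNull(CONFIGURED, InsertRetryPolicy.alwaysRetry());
    System.out.println(policy != null); // true
  }
}
{code}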





[jira] [Closed] (BEAM-2870) BQ Partitioned Table Write Fails When Destination has Partition Decorator

2017-09-21 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax closed BEAM-2870.

Resolution: Fixed

> BQ Partitioned Table Write Fails When Destination has Partition Decorator
> -
>
> Key: BEAM-2870
> URL: https://issues.apache.org/jira/browse/BEAM-2870
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
> Environment: Dataflow Runner, Streaming, 10 x (n1-highmem-8 & 500gb SSD)
>Reporter: Steven Jon Anderson
>Assignee: Reuven Lax
>  Labels: bigquery, dataflow, google, google-cloud-bigquery, 
> google-dataflow
> Fix For: 2.2.0
>
>
> Dataflow Job ID: 
> https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816
> Tagging [~reuvenlax] as I believe he built the time partitioning integration 
> that was merged into master.
> *Background*
> Our production pipeline ingests millions of events per day and routes events 
> into our clients' numerous tables. To keep costs down, all of our tables are 
> partitioned. However, this requires that we create the tables before we allow 
> events to process as creating partitioned tables isn't supported in 2.1.0. 
> We've been looking forward to [~reuvenlax]'s partition table write feature 
> ([#3663|https://github.com/apache/beam/pull/3663]) to get merged into master 
> for some time now as it'll allow us to launch our client platforms much, much 
> faster. Today we got around to testing the 2.2.0 nightly and discovered this 
> bug.
> *Issue*
> Our pipeline writes to tables using partition decorators. When writing to an 
> existing partitioned table with a decorator, the write succeeds. When writing 
> to a partitioned table that doesn't exist, using a destination without a 
> decorator, the write also succeeds. *However, when writing to a partitioned 
> table that doesn't exist, using a destination with a decorator, the write 
> fails*.
> *Example Implementation*
> {code:java}
> BigQueryIO.writeTableRows()
>   .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>   .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>   .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
>   .to(new DynamicDestinations<TableRow, String>() {
> @Override
> public String getDestination(ValueInSingleWindow<TableRow> element) {
>   return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
> }
> @Override
> public TableDestination getTable(String destination) {
>   TimePartitioning DAY_PARTITION = new TimePartitioning().setType("DAY");
>   return new TableDestination(destination, null, DAY_PARTITION);
> }
> @Override
> public TableSchema getSchema(String destination) {
>   return TABLE_SCHEMA;
> }
>   })
> {code}
> *Relevant Logs & Errors in StackDriver*
> {code:none}
> 23:06:26.790 
> Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902
> 23:06:26.873 
> Invalid table ID \"TABLE_ID$20170902\". Table IDs must be alphanumeric (plus underscores) and must be at most 1024 characters long. Also, Table decorators cannot be used.
> {code}
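
The fix that landed (commits later in this digest) strips the {{$}} decorator 
from the table spec before the create call, since the decorator is valid for 
writes but not for table creation; a runnable sketch of the helper's logic:

{code:java}
public class StripDecorator {
  // Same logic as BigQueryHelpers.stripPartitionDecorator in the fix below.
  static String stripPartitionDecorator(String tableSpec) {
    int index = tableSpec.lastIndexOf('$');
    return (index == -1) ? tableSpec : tableSpec.substring(0, index);
  }

  public static void main(String[] args) {
    System.out.println(stripPartitionDecorator("PROJECT_ID:DATASET_ID.TABLE_ID$20170902"));
    // -> PROJECT_ID:DATASET_ID.TABLE_ID
  }
}
{code}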





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3897

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2870) BQ Partitioned Table Write Fails When Destination has Partition Decorator

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175838#comment-16175838
 ] 

ASF GitHub Bot commented on BEAM-2870:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3866




[2/3] beam git commit: Strip table decorators before creating tables.

2017-09-21 Thread reuvenlax
Strip table decorators before creating tables.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0f50eb75
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0f50eb75
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0f50eb75

Branch: refs/heads/master
Commit: 0f50eb759dd3c810f0ef70d66a8077df227cb372
Parents: 0a073af
Author: Reuven Lax 
Authored: Tue Sep 19 11:38:13 2017 -0700
Committer: Reuven Lax 
Committed: Thu Sep 21 20:16:22 2017 -0700

--
 .../apache/beam/sdk/io/gcp/bigquery/CreateTables.java|  2 +-
 .../beam/sdk/io/gcp/bigquery/TableDestination.java   | 10 ++
 .../apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java  | 11 +++
 3 files changed, 22 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/0f50eb75/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
index 7f83b83..aff5ff1 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
@@ -113,7 +113,7 @@ public class CreateTables
   private void possibleCreateTable(
       BigQueryOptions options, TableDestination tableDestination, TableSchema tableSchema)
       throws InterruptedException, IOException {
-    String tableSpec = tableDestination.getTableSpec();
+    String tableSpec = tableDestination.getStrippedTableSpec();
     TableReference tableReference = tableDestination.getTableReference();
     String tableDescription = tableDestination.getTableDescription();
     if (createDisposition != createDisposition.CREATE_NEVER && !createdTables.contains(tableSpec)) {

http://git-wip-us.apache.org/repos/asf/beam/blob/0f50eb75/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
index 79f1b22..4a4f66b 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
@@ -30,6 +30,7 @@ import javax.annotation.Nullable;
 public class TableDestination implements Serializable {
   private static final long serialVersionUID = 1L;
   private final String tableSpec;
+  @Nullable private String strippedTableSpec;
   @Nullable
   private final String tableDescription;
   @Nullable
@@ -59,6 +60,7 @@ public class TableDestination implements Serializable {
   public TableDestination(String tableSpec, @Nullable String tableDescription,
       @Nullable String jsonTimePartitioning) {
     this.tableSpec = tableSpec;
+    this.strippedTableSpec = null;
     this.tableDescription = tableDescription;
     this.jsonTimePartitioning = jsonTimePartitioning;
   }
@@ -68,6 +70,14 @@ public class TableDestination implements Serializable {
 return tableSpec;
   }
 
+  public String getStrippedTableSpec() {
+    if (strippedTableSpec == null) {
+      int index = tableSpec.lastIndexOf('$');
+      strippedTableSpec = (index == -1) ? tableSpec : tableSpec.substring(0, index);
+    }
+    return strippedTableSpec;
+  }
+
   public TableReference getTableReference() {
 return BigQueryHelpers.parseTableSpec(tableSpec);
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/0f50eb75/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
index 9120507..7927282 100644
--- 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
+++ 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
@@ -2343,4 +2343,15 @@ public class BigQueryIOTest implements Serializable {
 }
 return converted;
   }
+
+  @Test
+  public void testTableDecoratorStr

[GitHub] beam pull request #3866: [BEAM-2870] Strip table decorators before creating ...

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3866


---


[3/3] beam git commit: This closes #3866

2017-09-21 Thread reuvenlax
This closes #3866


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/66b864f2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/66b864f2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/66b864f2

Branch: refs/heads/master
Commit: 66b864f2b3f95780f8952bafcafd8796ea931c2b
Parents: 0a073af fb6417f
Author: Reuven Lax 
Authored: Thu Sep 21 20:17:38 2017 -0700
Committer: Reuven Lax 
Committed: Thu Sep 21 20:17:38 2017 -0700

--
 .../sdk/io/gcp/bigquery/BigQueryHelpers.java|  8 
 .../beam/sdk/io/gcp/bigquery/CreateTables.java  |  8 ++--
 .../sdk/io/gcp/bigquery/TableDestination.java   |  1 -
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 46 ++--
 .../sdk/io/gcp/bigquery/FakeDatasetService.java | 15 ++-
 5 files changed, 70 insertions(+), 8 deletions(-)
--




[1/3] beam git commit: Move stripping code into BigQueryHelpers and add better unit-test coverage.

2017-09-21 Thread reuvenlax
Repository: beam
Updated Branches:
  refs/heads/master 0a073af40 -> 66b864f2b


Move stripping code into BigQueryHelpers and add better unit-test coverage.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fb6417f8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fb6417f8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fb6417f8

Branch: refs/heads/master
Commit: fb6417f81482d22ef1cff9505a6589360e506dc0
Parents: 0f50eb7
Author: Reuven Lax 
Authored: Tue Sep 19 20:24:18 2017 -0700
Committer: Reuven Lax 
Committed: Thu Sep 21 20:16:22 2017 -0700

--
 .../sdk/io/gcp/bigquery/BigQueryHelpers.java|  8 
 .../beam/sdk/io/gcp/bigquery/CreateTables.java  |  8 ++--
 .../sdk/io/gcp/bigquery/TableDestination.java   | 11 -
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 47 
 .../sdk/io/gcp/bigquery/FakeDatasetService.java | 15 ++-
 5 files changed, 65 insertions(+), 24 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/fb6417f8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
index 7f9e27a..02a47c2 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
@@ -112,6 +112,14 @@ public class BigQueryHelpers {
     return ref.setDatasetId(match.group("DATASET")).setTableId(match.group("TABLE"));
   }
 
+  /**
+   * Strip off any partition decorator information from a tablespec.
+   */
+  public static String stripPartitionDecorator(String tableSpec) {
+    int index = tableSpec.lastIndexOf('$');
+    return (index == -1) ? tableSpec : tableSpec.substring(0, index);
+  }
+
   static String jobToPrettyString(@Nullable Job job) throws IOException {
 return job == null ? "null" : job.toPrettyString();
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/fb6417f8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
index aff5ff1..fedd2fe 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java
@@ -113,9 +113,7 @@ public class CreateTables
   private void possibleCreateTable(
       BigQueryOptions options, TableDestination tableDestination, TableSchema tableSchema)
       throws InterruptedException, IOException {
-    String tableSpec = tableDestination.getStrippedTableSpec();
-    TableReference tableReference = tableDestination.getTableReference();
-    String tableDescription = tableDestination.getTableDescription();
+    String tableSpec = BigQueryHelpers.stripPartitionDecorator(tableDestination.getTableSpec());
     if (createDisposition != createDisposition.CREATE_NEVER && !createdTables.contains(tableSpec)) {
       synchronized (createdTables) {
         // Another thread may have succeeded in creating the table in the meanwhile, so
@@ -123,6 +121,10 @@ public class CreateTables
         // every thread from attempting a create and overwhelming our BigQuery quota.
         DatasetService datasetService = bqServices.getDatasetService(options);
         if (!createdTables.contains(tableSpec)) {
+          TableReference tableReference = tableDestination.getTableReference();
+          String tableDescription = tableDestination.getTableDescription();
+          tableReference.setTableId(
+              BigQueryHelpers.stripPartitionDecorator(tableReference.getTableId()));
           if (datasetService.getTable(tableReference) == null) {
             Table table = new Table()
                 .setTableReference(tableReference)

http://git-wip-us.apache.org/repos/asf/beam/blob/fb6417f8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/TableDestination.java
 
b/sdks/java/io/google-cloud-platfo

[3/3] beam git commit: This closes #3864

2017-09-21 Thread reuvenlax
This closes #3864


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0a073af4
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0a073af4
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0a073af4

Branch: refs/heads/master
Commit: 0a073af407bf92f58240c265851699a5c782a317
Parents: 0d5d00d 63b60b1
Author: Reuven Lax 
Authored: Thu Sep 21 20:10:01 2017 -0700
Committer: Reuven Lax 
Committed: Thu Sep 21 20:10:01 2017 -0700

--
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java|  6 ++-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 51 ++--
 2 files changed, 53 insertions(+), 4 deletions(-)
--




[2/3] beam git commit: Make sure that we default to alwaysRetry instead of passing in a null retry policy.

2017-09-21 Thread reuvenlax
Make sure that we default to alwaysRetry instead of passing in a null retry 
policy.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cc24c86e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cc24c86e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cc24c86e

Branch: refs/heads/master
Commit: cc24c86e5e17f9ac2ede45a6b6904dd23e90c014
Parents: 0d5d00d
Author: Reuven Lax 
Authored: Tue Sep 19 10:40:12 2017 -0700
Committer: Reuven Lax 
Committed: Thu Sep 21 20:09:21 2017 -0700

--
 .../java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java   | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/cc24c86e/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
index 3cb0d3b..3a4b699 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
@@ -34,6 +34,7 @@ import com.google.api.services.bigquery.model.TableSchema;
 import com.google.api.services.bigquery.model.TimePartitioning;
 import com.google.auto.value.AutoValue;
 import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.MoreObjects;
 import com.google.common.base.Predicates;
 import com.google.common.base.Strings;
 import com.google.common.collect.ImmutableList;
@@ -1278,9 +1279,12 @@ public class BigQueryIO {
           getWriteDisposition() != WriteDisposition.WRITE_TRUNCATE,
           "WriteDisposition.WRITE_TRUNCATE is not supported for an unbounded"
               + " PCollection.");
+      InsertRetryPolicy retryPolicy = MoreObjects.firstNonNull(
+          getFailedInsertRetryPolicy(), InsertRetryPolicy.alwaysRetry());
+
       StreamingInserts streamingInserts =
           new StreamingInserts<>(getCreateDisposition(), dynamicDestinations)
-              .withInsertRetryPolicy(getFailedInsertRetryPolicy())
+              .withInsertRetryPolicy(retryPolicy)
               .withTestServices((getBigQueryServices()));
       return rowsWithDestination.apply(streamingInserts);
     } else {



[1/3] beam git commit: Add unit-test coverage.

2017-09-21 Thread reuvenlax
Repository: beam
Updated Branches:
  refs/heads/master 0d5d00d70 -> 0a073af40


Add unit-test coverage.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/63b60b1c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/63b60b1c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/63b60b1c

Branch: refs/heads/master
Commit: 63b60b1c7438d78702332b4cbc41cf717f2089b7
Parents: cc24c86
Author: Reuven Lax 
Authored: Tue Sep 19 22:11:29 2017 -0700
Committer: Reuven Lax 
Committed: Thu Sep 21 20:09:21 2017 -0700

--
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 51 ++--
 1 file changed, 48 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/63b60b1c/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
index c4403b0..9120507 100644
--- 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
+++ 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
@@ -134,7 +134,6 @@ import org.apache.beam.sdk.util.MimeTypes;
 import org.apache.beam.sdk.util.WindowedValue;
 import org.apache.beam.sdk.values.KV;
 import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PCollection.IsBounded;
 import org.apache.beam.sdk.values.PCollectionView;
 import org.apache.beam.sdk.values.ShardedKey;
 import org.apache.beam.sdk.values.TupleTag;
@@ -741,6 +740,53 @@ public class BigQueryIOTest implements Serializable {
   }
 
   @Test
+  public void testFailuresNoRetryPolicy() throws Exception {
+    BigQueryOptions bqOptions = TestPipeline.testingPipelineOptions().as(BigQueryOptions.class);
+    bqOptions.setProject("project-id");
+    bqOptions.setTempLocation(testFolder.newFolder("BigQueryIOTest").getAbsolutePath());
+
+    FakeDatasetService datasetService = new FakeDatasetService();
+    FakeBigQueryServices fakeBqServices = new FakeBigQueryServices()
+        .withJobService(new FakeJobService())
+        .withDatasetService(datasetService);
+
+    datasetService.createDataset("project-id", "dataset-id", "", "");
+
+    TableRow row1 = new TableRow().set("name", "a").set("number", "1");
+    TableRow row2 = new TableRow().set("name", "b").set("number", "2");
+    TableRow row3 = new TableRow().set("name", "c").set("number", "3");
+
+    TableDataInsertAllResponse.InsertErrors ephemeralError =
+        new TableDataInsertAllResponse.InsertErrors().setErrors(
+            ImmutableList.of(new ErrorProto().setReason("timeout")));
+
+    datasetService.failOnInsert(
+        ImmutableMap.<TableRow, List<TableDataInsertAllResponse.InsertErrors>>of(
+            row1, ImmutableList.of(ephemeralError, ephemeralError),
+            row2, ImmutableList.of(ephemeralError, ephemeralError)));
+
+    Pipeline p = TestPipeline.create(bqOptions);
+    p.apply(Create.of(row1, row2, row3))
+        .apply(
+            BigQueryIO.writeTableRows()
+                .to("project-id:dataset-id.table-id")
+                .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
+                .withMethod(Method.STREAMING_INSERTS)
+                .withSchema(
+                    new TableSchema()
+                        .setFields(
+                            ImmutableList.of(
+                                new TableFieldSchema().setName("name").setType("STRING"),
+                                new TableFieldSchema().setName("number").setType("INTEGER"))))
+                .withTestServices(fakeBqServices)
+                .withoutValidation());
+    p.run();
+
+    assertThat(datasetService.getAllRows("project-id", "dataset-id", "table-id"),
+        containsInAnyOrder(row1, row2, row3));
+  }
+
+  @Test
   public void testRetryPolicy() throws Exception {
     BigQueryOptions bqOptions = TestPipeline.testingPipelineOptions().as(BigQueryOptions.class);
     bqOptions.setProject("project-id");
@@ -772,9 +818,9 @@ public class BigQueryIOTest implements Serializable {
     Pipeline p = TestPipeline.create(bqOptions);
     PCollection failedRows =
         p.apply(Create.of(row1, row2, row3))
-            .setIsBoundedInternal(IsBounded.UNBOUNDED)
             .apply(BigQueryIO.writeTableRows().to("project-id:dataset-id.table-id")
                 .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
+                .withMethod(Method.STREAMING_INSERTS)
                 .withSchema(new TableSchema().setFields(
                     ImmutableList.of(
                         new TableFieldSche

Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #4848

2017-09-21 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Initial set of pipeline jobs.

--
[...truncated 1.63 MB...]
2017-09-22T03:00:19.818 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/ow2/asm/asm-commons/5.0.4/asm-commons-5.0.4.jar
 (41 KB at 44.7 KB/sec)
2017-09-22T03:00:19.818 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar
2017-09-22T03:00:19.820 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/ant/ant/1.8.2/ant-1.8.2.jar 
(1889 KB at 2063.8 KB/sec)
2017-09-22T03:00:19.846 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar
 (12 KB at 12.7 KB/sec)
2017-09-22T03:00:19.922 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/carrotsearch/randomizedtesting/randomizedtesting-runner/2.3.2/randomizedtesting-runner-2.3.2.jar
 (233 KB at 228.5 KB/sec)
2017-09-22T03:00:20.331 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/solr/solr-core/5.5.4/solr-core-5.5.4.jar
 (3805 KB at 2667.7 KB/sec)
2017-09-22T03:00:20.346 [INFO] Downloading: 
http://maven.restlet.org/org/apache/lucene/lucene-test-framework/5.5.4/lucene-test-framework-5.5.4.jar
2017-09-22T03:00:20.347 [INFO] Downloading: 
http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
2017-09-22T03:00:20.347 [INFO] Downloading: 
http://maven.restlet.org/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
2017-09-22T03:00:20.768 [INFO] Downloaded: 
http://maven.restlet.org/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar
 (23 KB at 53.3 KB/sec)
2017-09-22T03:00:20.983 [INFO] Downloaded: 
http://maven.restlet.org/org/restlet/jee/org.restlet/2.3.0/org.restlet-2.3.0.jar
 (692 KB at 1085.5 KB/sec)
2017-09-22T03:00:20.985 [INFO] Downloading: 
https://repository.cloudera.com/artifactory/libs-release/org/apache/lucene/lucene-test-framework/5.5.4/lucene-test-framework-5.5.4.jar
[JENKINS] Archiving disabled
2017-09-22T03:00:21.931 [INFO]  
   
2017-09-22T03:00:21.932 [INFO] 

2017-09-22T03:00:21.932 [INFO] Skipping Apache Beam :: Parent
2017-09-22T03:00:21.932 [INFO] This project has been banned from the build due 
to previous failures.
2017-09-22T03:00:21.932 [INFO] 

[JENKINS] Archiving disabled
2017-09-22T03:01:05.281 [INFO] 

2017-09-22T03:01:05.281 [INFO] Reactor Summary:
2017-09-22T03:01:05.281 [INFO] 
2017-09-22T03:01:05.281 [INFO] Apache Beam :: Parent .......................... SUCCESS [ 34.560 s]
2017-09-22T03:01:05.281 [INFO] Apache Beam :: SDKs :: Java :: Build Tools ..... SUCCESS [ 13.155 s]
2017-09-22T03:01:05.281 [INFO] Apache Beam :: SDKs ............................ SUCCESS [  6.526 s]
2017-09-22T03:01:05.281 [INFO] Apache Beam :: SDKs :: Common 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3896

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3895

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2884) Dataflow runs portable pipelines

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175805#comment-16175805
 ] 

ASF GitHub Bot commented on BEAM-2884:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3859


> Dataflow runs portable pipelines
> 
>
> Key: BEAM-2884
> URL: https://issues.apache.org/jira/browse/BEAM-2884
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Henning Rohde
>  Labels: portability
>
> Dataflow should run pipelines using the full portability API as currently 
> defined:
> https://s.apache.org/beam-fn-api 
> https://s.apache.org/beam-runner-api
> https://s.apache.org/beam-job-api
> https://s.apache.org/beam-fn-api-container-contract
> This issue tracks its adoption of the portability framework. New Fn API and 
> other features will be tracked separately.





[GitHub] beam pull request #3859: [BEAM-2884] Send portable protos for ParDo in Dataf...

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3859


---


[4/5] beam git commit: Shade generated Runner API classes in Dataflow runner

2017-09-21 Thread kenn
Shade generated Runner API classes in Dataflow runner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3c2f3639
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3c2f3639
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3c2f3639

Branch: refs/heads/master
Commit: 3c2f36399249bc39ff184b9bc060e6e645e98cdc
Parents: 63e2965
Author: Kenneth Knowles 
Authored: Thu Sep 21 17:59:27 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Sep 21 17:59:27 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 5 +
 1 file changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/3c2f3639/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index cd2f70f..79614ae 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -183,6 +183,7 @@
 com.google.guava:guava
 com.google.protobuf:protobuf-java
 
org.apache.beam:beam-runners-core-construction-java
+
org.apache.beam:beam-sdks-common-runner-api
   
 
 
@@ -219,6 +220,10 @@
 org.apache.beam.runners.core
 
org.apache.beam.runners.dataflow.repackaged.org.apache.beam.runners.core
   
+  
+org.apache.beam.sdk.common.runner
+
org.apache.beam.runners.dataflow.repackaged.org.apache.beam.sdk.common.runner
+  
 
 
   



[1/5] beam git commit: Update Dataflow worker to 20170921

2017-09-21 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 4e4d10212 -> 0d5d00d70


Update Dataflow worker to 20170921


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b4deb838
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b4deb838
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b4deb838

Branch: refs/heads/master
Commit: b4deb83882580aa9efd59a98464cd2ba87f65661
Parents: ed4e868
Author: Kenneth Knowles 
Authored: Thu Sep 21 13:57:03 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Sep 21 13:59:15 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b4deb838/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index eb490cb..9076317 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   <packaging>jar</packaging>
 
   <properties>
-    <dataflow.container_version>beam-master-20170918</dataflow.container_version>
+    <dataflow.container_version>beam-master-20170921</dataflow.container_version>
     <dataflow.fnapi_environment_major_version>1</dataflow.fnapi_environment_major_version>
     <dataflow.legacy_environment_major_version>6</dataflow.legacy_environment_major_version>
   </properties>



[3/5] beam git commit: DataflowRunner depends on, and shades, protobuf

2017-09-21 Thread kenn
DataflowRunner depends on, and shades, protobuf


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/63e2965b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/63e2965b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/63e2965b

Branch: refs/heads/master
Commit: 63e2965b49ba39b6b6f77f023d2fb7267759fc84
Parents: b4deb83
Author: Kenneth Knowles 
Authored: Thu Sep 21 16:05:21 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Sep 21 16:05:21 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 10 ++
 1 file changed, 10 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/63e2965b/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 9076317..cd2f70f 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -181,6 +181,7 @@
             <artifactSet>
               <includes>
                 <include>com.google.guava:guava</include>
+                <include>com.google.protobuf:protobuf-java</include>
                 <include>org.apache.beam:beam-runners-core-construction-java</include>
               </includes>
             </artifactSet>
@@ -207,6 +208,10 @@
                 <shadedPattern>org.apache.beam.runners.dataflow.repackaged.com.google.common</shadedPattern>
               </relocation>
               <relocation>
+                <pattern>com.google.protobuf</pattern>
+                <shadedPattern>org.apache.beam.runners.dataflow.repackaged.com.google.protobuf</shadedPattern>
+              </relocation>
+              <relocation>
                 <pattern>com.google.thirdparty</pattern>
                 <shadedPattern>org.apache.beam.runners.dataflow.repackaged.com.google.thirdparty</shadedPattern>
               </relocation>
@@ -374,6 +379,11 @@
     </dependency>
 
     <dependency>
+      <groupId>com.google.protobuf</groupId>
+      <artifactId>protobuf-java</artifactId>
+    </dependency>
+
+    <dependency>
       <groupId>com.fasterxml.jackson.core</groupId>
       <artifactId>jackson-core</artifactId>
     </dependency>



[5/5] beam git commit: This closes #3859: [BEAM-2884] Send portable protos for ParDo in DataflowRunner

2017-09-21 Thread kenn
This closes #3859: [BEAM-2884] Send portable protos for ParDo in DataflowRunner

  Shade generated Runner API classes in Dataflow runner
  DataflowRunner depends on, and shades, protobuf
  Update Dataflow worker to 20170921
  Send portable ParDo protos to Dataflow instead of just DoFnInfo


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0d5d00d7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0d5d00d7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0d5d00d7

Branch: refs/heads/master
Commit: 0d5d00d7060d6e4ee8273201e3432f14abf35f8a
Parents: 4e4d102 3c2f363
Author: Kenneth Knowles 
Authored: Thu Sep 21 19:25:08 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Sep 21 19:25:08 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  17 ++-
 .../dataflow/DataflowPipelineTranslator.java| 142 +++
 .../runners/dataflow/TransformTranslator.java   |   3 +-
 3 files changed, 136 insertions(+), 26 deletions(-)
--




[2/5] beam git commit: Send portable ParDo protos to Dataflow instead of just DoFnInfo

2017-09-21 Thread kenn
Send portable ParDo protos to Dataflow instead of just DoFnInfo


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ed4e8684
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ed4e8684
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ed4e8684

Branch: refs/heads/master
Commit: ed4e86847e1c72176693f0b9af3812cf64b32fd5
Parents: aed6773
Author: Kenneth Knowles 
Authored: Sat Sep 16 15:26:49 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Sep 21 13:59:15 2017 -0700

--
 .../dataflow/DataflowPipelineTranslator.java| 142 +++
 .../runners/dataflow/TransformTranslator.java   |   3 +-
 2 files changed, 120 insertions(+), 25 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/ed4e8684/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
index 4f9b939..354781e 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
@@ -47,6 +47,7 @@ import com.google.common.collect.BiMap;
 import com.google.common.collect.ImmutableBiMap;
 import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Iterables;
+import com.google.protobuf.ByteString;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collections;
@@ -56,6 +57,9 @@ import java.util.List;
 import java.util.Map;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.annotation.Nullable;
+import org.apache.beam.runners.core.construction.PTransformTranslation;
+import org.apache.beam.runners.core.construction.ParDoTranslation;
+import org.apache.beam.runners.core.construction.SdkComponents;
 import org.apache.beam.runners.core.construction.SplittableParDo;
 import org.apache.beam.runners.core.construction.TransformInputs;
 import org.apache.beam.runners.core.construction.WindowingStrategyTranslation;
@@ -73,6 +77,7 @@ import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.Pipeline.PipelineVisitor;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.IterableCoder;
+import org.apache.beam.sdk.common.runner.v1.RunnerApi;
 import org.apache.beam.sdk.io.Read;
 import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.options.StreamingOptions;
@@ -439,7 +444,11 @@ public class DataflowPipelineTranslator {
           node.getFullName());
       LOG.debug("Translating {}", transform);
       currentTransform = node.toAppliedPTransform(getPipeline());
-      translator.translate(transform, this);
+      try {
+        translator.translate(transform, this);
+      } catch (IOException e) {
+        throw new RuntimeException(e);
+      }
       currentTransform = null;
     }
 
@@ -814,27 +823,27 @@
         ParDo.MultiOutput.class,
         new TransformTranslator() {
           @Override
-          public void translate(ParDo.MultiOutput transform, TranslationContext context) {
+          public void translate(ParDo.MultiOutput transform, TranslationContext context)
+              throws IOException {
             translateMultiHelper(transform, context);
           }
 
           private void translateMultiHelper(
-              ParDo.MultiOutput transform, TranslationContext context) {
+              ParDo.MultiOutput transform, TranslationContext context)
+              throws IOException {
 
             StepTranslationContext stepContext = context.addStep(transform, "ParallelDo");
-            translateInputs(
-                stepContext, context.getInput(transform), transform.getSideInputs(), context);
-            BiMap> outputMap =
-                translateOutputs(context.getOutputs(transform), stepContext);
+            PCollection input = (PCollection) context.getInput(transform);
+            translateInputs(stepContext, input, transform.getSideInputs(), context);
             translateFn(
                 stepContext,
                 transform.getFn(),
-                context.getInput(transform).getWindowingStrategy(),
+                input,
                 transform.getSideInputs(),
                 context.getInput(transform).getCoder(),
                 context,
-                outputMap.inverse().get(transform.getMainOutputTag()),
-                outputMap);
+                transform.getMain

[jira] [Commented] (BEAM-2596) Break up Jenkins PreCommit into individual steps.

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175804#comment-16175804
 ] 

ASF GitHub Bot commented on BEAM-2596:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3545


> Break up Jenkins PreCommit into individual steps.
> -
>
> Key: BEAM-2596
> URL: https://issues.apache.org/jira/browse/BEAM-2596
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system, testing
>Reporter: Jason Kuster
>Assignee: Jason Kuster
>






[GitHub] beam pull request #3545: [BEAM-2596] Pipeline job for Jenkins PreCommit

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3545


---


[1/2] beam git commit: This closes #3545

2017-09-21 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master d6c63396a -> 4e4d10212


This closes #3545


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4e4d1021
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4e4d1021
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4e4d1021

Branch: refs/heads/master
Commit: 4e4d102124576aefc3f71e432dbf619792e3
Parents: d6c6339 4f7e0d6
Author: Thomas Groh 
Authored: Thu Sep 21 19:21:38 2017 -0700
Committer: Thomas Groh 
Committed: Thu Sep 21 19:21:38 2017 -0700

--
 .test-infra/jenkins/PreCommit_Pipeline.groovy   |  89 +
 .../jenkins/common_job_properties.groovy| 185 ++-
 .test-infra/jenkins/job_beam_Java_Build.groovy  |  82 
 .../jenkins/job_beam_Java_CodeHealth.groovy |  39 
 .../job_beam_Java_IntegrationTest.groovy|  63 +++
 .../jenkins/job_beam_Java_UnitTest.groovy   |  49 +
 .../jenkins/job_beam_PreCommit_Pipeline.groovy  |  81 
 .../jenkins/job_beam_Python_UnitTest.groovy |  40 
 8 files changed, 581 insertions(+), 47 deletions(-)
--




[2/2] beam git commit: Initial set of pipeline jobs.

2017-09-21 Thread tgroh
Initial set of pipeline jobs.

Signed-off-by: Jason Kuster 


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4f7e0d65
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4f7e0d65
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4f7e0d65

Branch: refs/heads/master
Commit: 4f7e0d65c514f022c0675dec50853ac3c7554be7
Parents: d6c6339
Author: Jason Kuster 
Authored: Wed Jun 28 16:22:52 2017 -0700
Committer: Thomas Groh 
Committed: Thu Sep 21 19:21:38 2017 -0700

--
 .test-infra/jenkins/PreCommit_Pipeline.groovy   |  89 +
 .../jenkins/common_job_properties.groovy| 185 ++-
 .test-infra/jenkins/job_beam_Java_Build.groovy  |  82 
 .../jenkins/job_beam_Java_CodeHealth.groovy |  39 
 .../job_beam_Java_IntegrationTest.groovy|  63 +++
 .../jenkins/job_beam_Java_UnitTest.groovy   |  49 +
 .../jenkins/job_beam_PreCommit_Pipeline.groovy  |  81 
 .../jenkins/job_beam_Python_UnitTest.groovy |  40 
 8 files changed, 581 insertions(+), 47 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/4f7e0d65/.test-infra/jenkins/PreCommit_Pipeline.groovy
--
diff --git a/.test-infra/jenkins/PreCommit_Pipeline.groovy 
b/.test-infra/jenkins/PreCommit_Pipeline.groovy
new file mode 100644
index 000..20eaa56
--- /dev/null
+++ b/.test-infra/jenkins/PreCommit_Pipeline.groovy
@@ -0,0 +1,89 @@
+#!groovy
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import hudson.model.Result
+
+int NO_BUILD = -1
+
+// These are args for the GitHub Pull Request Builder (ghprb) Plugin. Providing these arguments is
+// necessary due to a bug in the ghprb plugin where environment variables are not correctly passed
+// to jobs downstream of a Pipeline job.
+// Tracked by https://github.com/jenkinsci/ghprb-plugin/issues/572.
+List ghprbArgs = [
+string(name: 'ghprbGhRepository', value: "${ghprbGhRepository}"),
+string(name: 'ghprbActualCommit', value: "${ghprbActualCommit}"),
+string(name: 'ghprbPullId', value: "${ghprbPullId}")
+]
+
+// This argument is the commit at which to build.
+List commitArg = [string(name: 'commit', value: "origin/pr/${ghprbPullId}/head")]
+
+int javaBuildNum = NO_BUILD
+
+// This (and the below) define "Stages" of a pipeline. These stages run serially, and inside can
+// have "parallel" blocks which execute several work steps concurrently. This work is limited to
+// simple operations -- more complicated operations need to be performed on an actual node. In this
+// case we are using the pipeline to trigger downstream builds.
+stage('Build') {
+    parallel (
+        java: {
+            def javaBuild = build job: 'beam_Java_Build', parameters: commitArg + ghprbArgs
+            if(javaBuild.getResult() == Result.SUCCESS.toString()) {
+                javaBuildNum = javaBuild.getNumber()
+            }
+        },
+        python_unit: { // Python doesn't have a build phase, so we include this here.
+            build job: 'beam_Python_UnitTest', parameters: commitArg + ghprbArgs
+        }
+    )
+}
+
+// This argument is provided to downstream jobs so they know from which build to pull artifacts.
+javaBuildArg = [string(name: 'buildNum', value: "${javaBuildNum}")]
+javaUnitPassed = false
+
+stage('Unit Test / Code Health') {
+    parallel (
+        java_unit: {
+            if(javaBuildNum != NO_BUILD) {
+                def javaTest = build job: 'beam_Java_UnitTest', parameters: javaBuildArg + ghprbArgs
+                if(javaTest.getResult() == Result.SUCCESS.toString()) {
+                    javaUnitPassed = true
+                }
+            }
+        },
+        java_codehealth: {
+            if(javaBuildNum != NO_BUILD) {
+                build job: 'beam_Java_CodeHealth', parameters: javaBuildArg + ghprbArgs
+            }
+        }
+    )
+}
+
+stage('Integration Test') {
+    parallel (
+        // Not gated on codehealth because codehealth shouldn't affect 

[GitHub] beam pull request #3883: Sets a TTL on BigQueryIO.read().fromQuery() temp da...

2017-09-21 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3883

Sets a TTL on BigQueryIO.read().fromQuery() temp dataset

Also fixes a bug where we start the query job twice - once to extract the 
files, once to get schema. Luckily it doesn't actually run twice, because 
inserting the same job a second time gives an ignorable error, but it was still 
icky.

Also adds some logging.
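
A sketch of the TTL part of the change (illustrative only, using the
google-api-services-bigquery model classes; the PR's actual code may differ):

    import com.google.api.services.bigquery.model.Dataset;
    import com.google.api.services.bigquery.model.DatasetReference;
    import java.util.concurrent.TimeUnit;

    public class TempDatasetTtl {
      // Build a temporary dataset whose tables expire on their own, so
      // abandoned query results are garbage-collected by BigQuery.
      static Dataset newTempDataset(String projectId, String datasetId) {
        return new Dataset()
            .setDatasetReference(
                new DatasetReference().setProjectId(projectId).setDatasetId(datasetId))
            .setDefaultTableExpirationMs(TimeUnit.DAYS.toMillis(1));
      }
    }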

R: @reuvenlax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam bq-temp-ttl

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3883.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3883


commit f92e5b3331be26dfeaee66b65d7bb2e465aa26c9
Author: Eugene Kirpichov 
Date:   2017-09-22T02:01:43Z

Sets a TTL on BigQueryIO.read().fromQuery() temp dataset

Also fixes a bug where we start the query job twice -
once to extract the files, once to get schema. Luckily it doesn't
actually run twice, because inserting the same job a second time gives
an ignorable error, but it was still icky.

Also adds some logging.




---


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4019

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-09-21 Thread Wesley Tanaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wesley Tanaka updated BEAM-2979:

Description: 
getWatermark() looks like this:

{noformat}
@Override
public Instant getWatermark() {
  if (curRecord == null) {
    LOG.debug("{}: getWatermark() : no records have been read yet.", name);
    return initialWatermark;
  }

  return source.spec.getWatermarkFn() != null
      ? source.spec.getWatermarkFn().apply(curRecord) : curTimestamp;
}
{noformat}

advance() has code in it that looks like this:

{noformat}
      curRecord = null; // user coders below might throw.

      // apply user deserializers.
      // TODO: write records that can't be deserialized to a "dead-letter" additional output.
      KafkaRecord<K, V> record = new KafkaRecord<K, V>(
          rawRecord.topic(),
          rawRecord.partition(),
          rawRecord.offset(),
          consumerSpEL.getRecordTimestamp(rawRecord),
          keyDeserializerInstance.deserialize(rawRecord.topic(), rawRecord.key()),
          valueDeserializerInstance.deserialize(rawRecord.topic(), rawRecord.value()));

      curTimestamp = (source.spec.getTimestampFn() == null)
          ? Instant.now() : source.spec.getTimestampFn().apply(record);
      curRecord = record;
{noformat}

There's a race condition between these two blocks of code which is exposed at 
the very least in the FlinkRunner, which calls getWatermark() periodically from 
a timer.

The symptom of the race condition is a stack trace that looks like this (SDK 
2.0.0):

{noformat}
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:910)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: TimerException{java.lang.NullPointerException}
    at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:220)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:568)
    at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:565)
    at org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaReader.getWatermark(KafkaIO.java:1210)
    at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.onProcessingTime(UnboundedSourceWrapper.java:431)
    at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:218)
    ... 7 more
{noformat}

Based on inspecting the code, what is probably happening is that while 
advance() is executing:

* The Flink runner calls getWatermark()
* getWatermark() evaluates (curRecord == null) and it is false
* advance() proceeds to set curRecord = null
* The Flink runner thread calls getWatermarkFn().apply(curRecord), which passes 
a null record into the custom watermark function
* If that watermark function was set with withWatermarkFn() (as suggested in 
the javadoc), then it uses the closure created in unwrapKafkaAndThen()
* That closure calls record.getKV(), and we get the NullPointerException (a 
sketch of one possible mitigation follows this list)
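
One possible mitigation (a hedged sketch only, not necessarily the fix 
ultimately applied to KafkaIO) is to read curRecord into a local variable once, 
so the null check and the watermark function are guaranteed to see the same 
value:

{noformat}
// Sketch of a possible fix. Assumes curRecord may be nulled concurrently by
// advance(); copying the reference once removes the check-then-act window.
@Override
public Instant getWatermark() {
  KafkaRecord<K, V> snapshot = curRecord;
  if (snapshot == null) {
    LOG.debug("{}: getWatermark() : no records have been read yet.", name);
    return initialWatermark;
  }
  return source.spec.getWatermarkFn() != null
      ? source.spec.getWatermarkFn().apply(snapshot) : curTimestamp;
}
{noformat}

Note that curTimestamp is read under the same hazard, so a complete fix would 
snapshot the record and timestamp together (for example, by guarding both 
fields with a lock shared with advance()).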

  was:
getWatermark() looks like this:


[jira] [Created] (BEAM-2979) Race condition between KafkaIO.UnboundedKafkaReader.getWatermark() and KafkaIO.UnboundedKafkaReader.advance()

2017-09-21 Thread Wesley Tanaka (JIRA)
Wesley Tanaka created BEAM-2979:
---

 Summary: Race condition between 
KafkaIO.UnboundedKafkaReader.getWatermark() and 
KafkaIO.UnboundedKafkaReader.advance()
 Key: BEAM-2979
 URL: https://issues.apache.org/jira/browse/BEAM-2979
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Affects Versions: 2.1.0, 2.0.0
Reporter: Wesley Tanaka
Assignee: Kenneth Knowles


getWatermark() looks like this:

{noformat}
@Override
public Instant getWatermark() {
  if (curRecord == null) {
LOG.debug("{}: getWatermark() : no records have been read yet.", name);
return initialWatermark;
  }

  return source.spec.getWatermarkFn() != null
  ? source.spec.getWatermarkFn().apply(curRecord) : curTimestamp;
}
{noformat}

advance() has code in it that looks like this:

{noformat}
  curRecord = null; // user coders below might throw.

  // apply user deserializers.
  // TODO: write records that can't be deserialized to a "dead-letter" additional output.
  KafkaRecord<K, V> record = new KafkaRecord<K, V>(
      rawRecord.topic(),
      rawRecord.partition(),
      rawRecord.offset(),
      consumerSpEL.getRecordTimestamp(rawRecord),
      keyDeserializerInstance.deserialize(rawRecord.topic(), rawRecord.key()),
      valueDeserializerInstance.deserialize(rawRecord.topic(), rawRecord.value()));

  curTimestamp = (source.spec.getTimestampFn() == null)
      ? Instant.now() : source.spec.getTimestampFn().apply(record);
  curRecord = record;
{noformat}

There's a race condition between these two blocks of code; it is exposed at 
least in the FlinkRunner, which calls getWatermark() periodically from a timer.

The symptom of the race condition is a stack trace that looks like this:

{noformat}
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:910)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:853)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: TimerException{java.lang.NullPointerException}
at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:220)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:568)
at org.apache.beam.sdk.io.kafka.KafkaIO$Read$1.apply(KafkaIO.java:565)
at org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaReader.getWatermark(KafkaIO.java:1210)
at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.onProcessingTime(UnboundedSourceWrapper.java:431)
at org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:218)
... 7 more
{noformat}

Based on inspecting the code, what is probably happening is that while 
advance() is executing:

* Flink runner calls getWatermark()
* getWatermark evaluates (curRecord == null) and it is false
* advance() proceeds to set curRecord = null
* The Flink runner thread calls getWatermarkFn().apply(curRecord), which passes 
a null record into the custom watermark function

Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4847

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Python_Verify #3187

2017-09-21 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3882: [BEAM-1630] Adds API for defining Splittable DoFns ...

2017-09-21 Thread chamikaramj
GitHub user chamikaramj opened a pull request:

https://github.com/apache/beam/pull/3882

[BEAM-1630] Adds API for defining Splittable DoFns using Python SDK.

See https://s.apache.org/splittable-do-fn-python-sdk for the design.

This PR and the above doc were updated to reflect the following recent updates 
to Splittable DoFn.
* Support for ProcessContinuations
* Support for dynamically updating output watermark irrespective of the 
output element production.

This will be followed by a PR that adds support for reading Splittable 
DoFns using DirectRunner.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/beam sdf_api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3882.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3882


commit 2fd11b1c0e212a1b267dbafd69c96e26fef4d319
Author: chamik...@google.com 
Date:   2017-09-22T00:43:11Z

Adds API for defining Splittable DoFns.

See https://s.apache.org/splittable-do-fn-python-sdk for the design.

This PR and the above doc were updated to reflect the following recent updates 
to Splittable DoFn.
* Support for ProcessContinuations
* Support for dynamically updating output watermark irrespective of the 
output element production.

This will be followed by a PR that adds support for reading Splittable 
DoFns using DirectRunner.




---


[jira] [Commented] (BEAM-1630) Add Splittable DoFn to Python SDK

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175754#comment-16175754
 ] 

ASF GitHub Bot commented on BEAM-1630:
--

GitHub user chamikaramj opened a pull request:

https://github.com/apache/beam/pull/3882

[BEAM-1630] Adds API for defining Splittable DoFns using Python SDK.

See https://s.apache.org/splittable-do-fn-python-sdk for the design.

This PR and the above doc were updated to reflect the following recent updates 
to Splittable DoFn.
* Support for ProcessContinuations
* Support for dynamically updating output watermark irrespective of the 
output element production.

This will be followed by a PR that adds support for reading Splittable 
DoFns using DirectRunner.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/beam sdf_api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3882.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3882


commit 2fd11b1c0e212a1b267dbafd69c96e26fef4d319
Author: chamik...@google.com 
Date:   2017-09-22T00:43:11Z

Adds API for defining Splittable DoFns.

See https://s.apache.org/splittable-do-fn-python-sdk for the design.

This PR and the above doc were updated to reflect the following recent updates 
to Splittable DoFn.
* Support for ProcessContinuations
* Support for dynamically updating output watermark irrespective of the 
output element production.

This will be followed by a PR that adds support for reading Splittable 
DoFns using DirectRunner.




> Add Splittable DoFn to Python SDK
> -
>
> Key: BEAM-1630
> URL: https://issues.apache.org/jira/browse/BEAM-1630
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> Splittable DoFn [1] is currently being implemented for the Java SDK [2]. We 
> should add this to the Python SDK as well.
> The following document proposes an API for this.
> https://docs.google.com/document/d/1h_zprJrOilivK2xfvl4L42vaX4DMYGfH1YDmi-s_ozM/edit?usp=sharing
> [1] https://s.apache.org/splittable-do-fn
> [2] 
> https://lists.apache.org/thread.html/0ce61ac162460a149d5c93cdface37cc383f8030fe86ca09e5699b18@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4846

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3894

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4845

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4018

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3893

2017-09-21 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3844: Mark AuctionOrBidWindowCoder as consistentWithEqual...

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3844


---


[3/4] beam git commit: Make AuctionOrBidWindowCoder use structuralValue instead of consistentWithEquals

2017-09-21 Thread kenn
Make AuctionOrBidWindowCoder use structuralValue instead of consistentWithEquals


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e3abd698
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e3abd698
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e3abd698

Branch: refs/heads/master
Commit: e3abd6988b9e9bf0ad465dda573870650ddceb2e
Parents: e93c035
Author: Daniel Mills 
Authored: Wed Sep 13 10:21:01 2017 -0700
Committer: Daniel Mills 
Committed: Wed Sep 13 10:21:01 2017 -0700

--
 .../java/org/apache/beam/sdk/nexmark/queries/WinningBids.java  | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e3abd698/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
--
diff --git 
a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
 
b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
index be64958..d73b8ae 100644
--- 
a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
+++ 
b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
@@ -167,8 +167,6 @@ public class WinningBids extends PTransform<PCollection<Event>, PCollection<AuctionBid>> {
     private static final Coder<IntervalWindow> SUPER_CODER = IntervalWindow.getCoder();
     private static final Coder<Long> ID_CODER = VarLongCoder.of();
     private static final Coder<Integer> INT_CODER = VarIntCoder.of();
-    private static final boolean CONSISTENT_WITH_EQUALS = SUPER_CODER.consistentWithEquals()
-        && ID_CODER.consistentWithEquals() && INT_CODER.consistentWithEquals();
 
     @JsonCreator
     public static AuctionOrBidWindowCoder of() {
@@ -196,8 +194,8 @@ public class WinningBids extends PTransform<PCollection<Event>, PCollection<AuctionBid>>
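
For readers following the coder changes above: consistentWithEquals() promises 
that value equality matches encoded-byte equality, while structuralValue() lets 
a coder hand back a surrogate object to compare instead. A stand-alone sketch 
of such a surrogate (plain Java, invented class name, no Beam dependency):

{noformat}
import java.util.Arrays;

// Wraps the encoded form of a value so equals()/hashCode() are defined on
// the bytes, which is what a structuralValue()-style comparison relies on.
final class StructuralBytes {
  private final byte[] encoded;

  StructuralBytes(byte[] encoded) {
    this.encoded = encoded;
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof StructuralBytes
        && Arrays.equals(encoded, ((StructuralBytes) other).encoded);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(encoded);
  }
}
{noformat}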

[4/4] beam git commit: This closes #3844: Mark AuctionOrBidWindowCoder as consistentWithEquals

2017-09-21 Thread kenn
This closes #3844: Mark AuctionOrBidWindowCoder as consistentWithEquals

  Make AuctionOrBidWindowCoder use structuralValue instead of 
consistentWithEquals
  Enforce correctness of consistentWithEquals and fix broken hashCode() method 
for AuctionOrBidWindowCoder
  Mark AuctionOrBidWindowCoder as consistentWithEquals


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d6c63396
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d6c63396
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d6c63396

Branch: refs/heads/master
Commit: d6c63396a8d27aaa973b0a403c75d62fb09d9c8e
Parents: 766513b e3abd69
Author: Kenneth Knowles 
Authored: Thu Sep 21 15:36:37 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Sep 21 15:36:37 2017 -0700

--
 .../java/org/apache/beam/sdk/nexmark/queries/WinningBids.java | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
--




[2/4] beam git commit: Enforce correctness of consistentWithEquals and fix broken hashCode() method for AuctionOrBidWindowCoder

2017-09-21 Thread kenn
Enforce correctness of consistentWithEquals and fix broken hashCode() method 
for AuctionOrBidWindowCoder


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e93c0357
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e93c0357
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e93c0357

Branch: refs/heads/master
Commit: e93c035711290d8c8ccefdae2e314faef7164179
Parents: bdc738e
Author: Daniel Mills 
Authored: Wed Sep 13 10:08:22 2017 -0700
Committer: Daniel Mills 
Committed: Wed Sep 13 10:08:22 2017 -0700

--
 .../java/org/apache/beam/sdk/nexmark/queries/WinningBids.java  | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e93c0357/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
--
diff --git 
a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
 
b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
index efdbe21..be64958 100644
--- 
a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
+++ 
b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
@@ -155,7 +155,7 @@ public class WinningBids extends PTransform<PCollection<Event>, PCollection<AuctionBid>> {
     private static final Coder<IntervalWindow> SUPER_CODER = IntervalWindow.getCoder();
     private static final Coder<Long> ID_CODER = VarLongCoder.of();
     private static final Coder<Integer> INT_CODER = VarIntCoder.of();
+    private static final boolean CONSISTENT_WITH_EQUALS = SUPER_CODER.consistentWithEquals()
+        && ID_CODER.consistentWithEquals() && INT_CODER.consistentWithEquals();
 
     @JsonCreator
     public static AuctionOrBidWindowCoder of() {
@@ -195,7 +197,7 @@ public class WinningBids extends PTransform<PCollection<Event>, PCollection<AuctionBid>>

[1/4] beam git commit: Mark AuctionOrBidWindowCoder as consistentWithEquals

2017-09-21 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 766513bc8 -> d6c63396a


Mark AuctionOrBidWindowCoder as consistentWithEquals


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bdc738e0
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bdc738e0
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bdc738e0

Branch: refs/heads/master
Commit: bdc738e0c010829af7f7649f6ea57dc044a18285
Parents: 8e391d9
Author: Daniel Mills 
Authored: Tue Sep 12 16:54:36 2017 -0700
Committer: Daniel Mills 
Committed: Tue Sep 12 16:58:01 2017 -0700

--
 .../java/org/apache/beam/sdk/nexmark/queries/WinningBids.java   | 5 +
 1 file changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/bdc738e0/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
--
diff --git 
a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
 
b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
index 816a81f..efdbe21 100644
--- 
a/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
+++ 
b/sdks/java/nexmark/src/main/java/org/apache/beam/sdk/nexmark/queries/WinningBids.java
@@ -192,6 +192,11 @@ public class WinningBids extends PTransform<PCollection<Event>, PCollection<AuctionBid>>

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3892

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-2978) add custom window Fn section in programming guide

2017-09-21 Thread Melissa Pashniak (JIRA)
Melissa Pashniak created BEAM-2978:
--

 Summary: add custom window Fn section in programming guide
 Key: BEAM-2978
 URL: https://issues.apache.org/jira/browse/BEAM-2978
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Melissa Pashniak
Assignee: Melissa Pashniak






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2975) Results of ReadableState.read() should be snapshots of the underlying state

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175580#comment-16175580
 ] 

ASF GitHub Bot commented on BEAM-2975:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3876


> Results of ReadableState.read() should be snapshots of the underlying state
> ---
>
> Key: BEAM-2975
> URL: https://issues.apache.org/jira/browse/BEAM-2975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Daniel Mills
>Assignee: Daniel Mills
>Priority: Minor
>
> Future modification of state should not be reflected in previous calls to 
> read().  For example:
> @StateId("tag") BagState<Integer> state;
> Iterable<Integer> ints = state.read();
> state.add(17);
> // ints should still be empty here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
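
To make the requested semantics concrete, here is a minimal stand-alone 
illustration (plain Java plus Guava; a sketch of the copy-on-read idea, not 
Beam's actual state classes):

{noformat}
import com.google.common.collect.ImmutableList;
import java.util.ArrayList;
import java.util.List;

public class SnapshotReadDemo {
  public static void main(String[] args) {
    List<Integer> contents = new ArrayList<>();   // stands in for a state cell

    // read() should behave like taking an immutable snapshot, not a live view.
    Iterable<Integer> ints = ImmutableList.copyOf(contents);

    contents.add(17);                             // later modification

    System.out.println(ints);      // [] -- the earlier read is unaffected
    System.out.println(contents);  // [17]
  }
}
{noformat}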


[GitHub] beam pull request #3876: [BEAM-2975] Clarify semantics of objects returned b...

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3876


---


[2/2] beam git commit: Clarify semantics of objects returned by state access

2017-09-21 Thread tgroh
Clarify semantics of objects returned by state access


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/77840fa3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/77840fa3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/77840fa3

Branch: refs/heads/master
Commit: 77840fa3565f6e0ba625556b3fcaff9fa408aca2
Parents: aed6773
Author: Daniel Mills 
Authored: Wed Sep 20 16:35:06 2017 -0700
Committer: Thomas Groh 
Committed: Thu Sep 21 15:32:42 2017 -0700

--
 .../runners/core/InMemoryStateInternals.java| 39 +--
 .../CopyOnAccessInMemoryStateInternalsTest.java | 74 +++-
 .../apache/beam/sdk/state/GroupingState.java| 12 +++-
 .../org/apache/beam/sdk/state/MapState.java | 20 +-
 .../apache/beam/sdk/state/ReadableState.java|  4 ++
 .../org/apache/beam/sdk/state/SetState.java | 10 ++-
 .../apache/beam/sdk/transforms/ParDoTest.java   | 44 +---
 7 files changed, 148 insertions(+), 55 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/77840fa3/runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryStateInternals.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryStateInternals.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryStateInternals.java
index 59814bc..075e264 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryStateInternals.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryStateInternals.java
@@ -17,8 +17,12 @@
  */
 package org.apache.beam.runners.core;
 
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableSet;
+import com.google.common.collect.Iterables;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Collection;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
@@ -326,7 +330,8 @@ public class InMemoryStateInternals<K> implements StateInternals<K> {
 
     @Override
     public OutputT read() {
-      return combineFn.extractOutput(accum);
+      return combineFn.extractOutput(
+          combineFn.mergeAccumulators(Arrays.asList(combineFn.createAccumulator(), accum)));
     }
 
     @Override
@@ -407,7 +412,7 @@ public class InMemoryStateInternals<K> implements StateInternals<K> {
 
     @Override
     public Iterable<T> read() {
-      return contents;
+      return Iterables.limit(contents, contents.size());
     }
 
     @Override
@@ -478,7 +483,7 @@ public class InMemoryStateInternals<K> implements StateInternals<K> {
 
     @Override
     public Iterable<T> read() {
-      return contents;
+      return ImmutableSet.copyOf(contents);
     }
 
     @Override
@@ -551,19 +556,41 @@ public class InMemoryStateInternals<K> implements StateInternals<K> {
       contents.remove(key);
     }
 
+    private static class CollectionViewState<T> implements ReadableState<Iterable<T>> {
+      private final Collection<T> collection;
+
+      private CollectionViewState(Collection<T> collection) {
+        this.collection = collection;
+      }
+
+      public static <T> CollectionViewState<T> of(Collection<T> collection) {
+        return new CollectionViewState<>(collection);
+      }
+
+      @Override
+      public Iterable<T> read() {
+        return ImmutableList.copyOf(collection);
+      }
+
+      @Override
+      public ReadableState<Iterable<T>> readLater() {
+        return this;
+      }
+    }
+
     @Override
     public ReadableState<Iterable<K>> keys() {
-      return ReadableStates.immediate((Iterable<K>) contents.keySet());
+      return CollectionViewState.of(contents.keySet());
     }
 
     @Override
     public ReadableState<Iterable<V>> values() {
-      return ReadableStates.immediate((Iterable<V>) contents.values());
+      return CollectionViewState.of(contents.values());
    }
 
     @Override
     public ReadableState<Iterable<Map.Entry<K, V>>> entries() {
-      return ReadableStates.immediate((Iterable<Map.Entry<K, V>>) contents.entrySet());
+      return CollectionViewState.of(contents.entrySet());
     }
 
     @Override

http://git-wip-us.apache.org/repos/asf/beam/blob/77840fa3/runners/direct-java/src/test/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternalsTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternalsTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternalsTest.java
index 1e60ca3..657bb7f 100644
--- 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternalsTest.java
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/CopyOnAccessInMemoryStateInternalsTest

[1/2] beam git commit: This closes #3876

2017-09-21 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master aed67731e -> 766513bc8


This closes #3876


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/766513bc
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/766513bc
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/766513bc

Branch: refs/heads/master
Commit: 766513bc8a2b977a5123d0657dc4b77bec0cfa73
Parents: aed6773 77840fa
Author: Thomas Groh 
Authored: Thu Sep 21 15:32:42 2017 -0700
Committer: Thomas Groh 
Committed: Thu Sep 21 15:32:42 2017 -0700

--
 .../runners/core/InMemoryStateInternals.java| 39 +--
 .../CopyOnAccessInMemoryStateInternalsTest.java | 74 +++-
 .../apache/beam/sdk/state/GroupingState.java| 12 +++-
 .../org/apache/beam/sdk/state/MapState.java | 20 +-
 .../apache/beam/sdk/state/ReadableState.java|  4 ++
 .../org/apache/beam/sdk/state/SetState.java | 10 ++-
 .../apache/beam/sdk/transforms/ParDoTest.java   | 44 +---
 7 files changed, 148 insertions(+), 55 deletions(-)
--
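
A side note on one detail in the diff above: for the append-only bag, the 
patch returns Iterables.limit(contents, contents.size()) rather than a full 
copy. Because elements are only ever appended, bounding the view at the 
current size makes later add() calls invisible through an earlier read(). A 
quick stand-alone check (plain Java plus Guava; demo names are invented):

{noformat}
import com.google.common.collect.Iterables;
import java.util.ArrayList;
import java.util.List;

public class LimitSnapshotDemo {
  public static void main(String[] args) {
    List<Integer> contents = new ArrayList<>();
    contents.add(1);

    // View bounded at the current size; appends after this stay invisible.
    Iterable<Integer> snapshot = Iterables.limit(contents, contents.size());

    contents.add(2);

    System.out.println(Iterables.toString(snapshot));  // [1]
    System.out.println(contents);                      // [1, 2]
  }
}
{noformat}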




[jira] [Assigned] (BEAM-419) Non-transient non-serializable instance field in CombineFnUtil$NonSerializableBoundedKeyedCombineFn

2017-09-21 Thread Daniel Oliveira (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-419:


Assignee: Daniel Oliveira

> Non-transient non-serializable instance field in 
> CombineFnUtil$NonSerializableBoundedKeyedCombineFn
> ---
>
> Key: BEAM-419
> URL: https://issues.apache.org/jira/browse/BEAM-419
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Daniel Oliveira
>Priority: Minor
>  Labels: findbugs, newbie, starter
>
> [FindBugs 
> SE_BAD_FIELD|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L363]:
>  Non-transient non-serializable instance field in serializable class
> Applies to: 
> [CombineFnUtil$NonSerializableBoundedKeyedCombineFn.context|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/util/CombineFnUtil.java#L170].
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
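
For context on this warning class: the standard ways to clear FindBugs 
SE_BAD_FIELD are to make the offending field transient (re-creating it lazily 
after deserialization) or to make its type Serializable. A hedged sketch of 
the transient-plus-lazy-init pattern (illustrative names only, not the actual 
Beam fix):

{noformat}
import java.io.Serializable;

// Stand-in for a non-serializable collaborator such as a context object.
class NonSerializableContext {
  String describe() { return "ctx"; }
}

class SerializableFn implements Serializable {
  // transient: skipped during serialization, so FindBugs no longer flags it.
  private transient NonSerializableContext context;

  // Lazily rebuilt, since deserialization leaves the field null.
  private NonSerializableContext context() {
    if (context == null) {
      context = new NonSerializableContext();
    }
    return context;
  }
}
{noformat}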


Build failed in Jenkins: beam_PostCommit_Python_Verify #3185

2017-09-21 Thread Apache Jenkins Server
See 


--
[...truncated 44.46 KB...]
Collecting mock<3.0.0,>=1.0.1 (from apache-beam==2.2.0.dev0)
  Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting oauth2client<4.0.0,>=2.0.1 (from apache-beam==2.2.0.dev0)
Collecting protobuf<=3.3.0,>=3.2.0 (from apache-beam==2.2.0.dev0)
  Using cached protobuf-3.3.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyyaml<4.0.0,>=3.12 (from apache-beam==2.2.0.dev0)
Collecting six<1.11,>=1.9 (from apache-beam==2.2.0.dev0)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting typing<3.7.0,>=3.6.0 (from apache-beam==2.2.0.dev0)
  Using cached typing-3.6.2-py2-none-any.whl
Requirement already satisfied: enum34>=1.0.4 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: futures>=2.2.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Collecting pbr>=0.11 (from mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached pbr-3.1.1-py2.py3-none-any.whl
Collecting funcsigs>=1; python_version < "3.3" (from 
mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.0.5 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1_modules-0.1.4-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1-0.3.6-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
protobuf<=3.3.0,>=3.2.0->apache-beam==2.2.0.dev0)
Building wheels for collected packages: apache-beam
  Running setup.py bdist_wheel for apache-beam: started
  Running setup.py bdist_wheel for apache-beam: finished with status 'error'
  Complete output from command 

 -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-bvaQvf-build/setup.py';f=getattr(tokenize, 'open', 
open)(__file__);code=f.read().replace('\r\n', 
'\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d 
/tmp/tmpgamts1pip-wheel- --python-tag cp27:
  
:351:
 UserWarning: Normalizing '2.2.0.dev' to '2.2.0.dev0'
normalized_version,
  running bdist_wheel
  running build
  running build_py
  Traceback (most recent call last):
File "", line 1, in 
File "/tmp/pip-bvaQvf-build/setup.py", line 203, in 
  'test': generate_protos_first(test),
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
  dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
  self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File 
"
 line 204, in run
  self.run_command('build')
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
  self.run_command(cmd_name)
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/tmp/pip-bvaQvf-build/setup.py", line 143, in run
  gen_protos.generate_proto_files()
File "gen_protos.py", line 65, in generate_proto_files
  'Not in apache git tree; unable to find proto definitions.')
  RuntimeError: Not in apache git tree; unable to find proto definitions.
  
  
  Failed building wheel for apache-beam
  Running setup.py clean for apache-beam
Failed to build apache-beam
Installing collected packages: avro, crcmod, dill, httplib2, pbr, funcsigs, 
six, mock, pyasn1, pyasn1-modules, rsa, oauth2client, protobuf, pyyaml, typing, 
apache-beam
  Found existing installation: six 1.11.0
Uninstalling six-1.11.0:
  Successfully uninstalled six-1.11.0
  Found existing installation: protobuf 3.4.0
Uninstalling protobuf-3.4.0:
  Successfully uninstalled protobuf-3.4.0
  Running setup.py install for apache-beam: started
Running setup.py install for apache-beam: finished with status 'error'
Complete output from comma

[jira] [Created] (BEAM-2977) improve unbounded prose in wordcount example

2017-09-21 Thread Melissa Pashniak (JIRA)
Melissa Pashniak created BEAM-2977:
--

 Summary: improve unbounded prose in wordcount example
 Key: BEAM-2977
 URL: https://issues.apache.org/jira/browse/BEAM-2977
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Melissa Pashniak
Assignee: Melissa Pashniak
Priority: Minor


request from tgroh: the way I think of unbounded is that it doesn't make sense 
to ask "do I have all of the data", just "at what point do I have all of the 
data up to"

context:
 ### Unbounded and bounded pipeline input modes
...
 If your input is continuously updating, then it's considered 'unbounded'.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[beam-site] branch asf-site updated (a13fc82 -> f478921)

2017-09-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from a13fc82  Prepare repository for deployment.
 add b288b69  [BEAM-667] Verify and update wordcount snippets
 add e866d1d  Update with review feedback
 add 30d5e8f  Update with review feedback
 add 7e8a86b  This closes #311
 new f478921  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/get-started/wordcount-example/index.html | 370 ---
 src/get-started/wordcount-example.md | 350 +++--
 2 files changed, 522 insertions(+), 198 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 01/01: Prepare repository for deployment.

2017-09-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit f4789215c08599aabad6f60a626ccdea43ca03f2
Author: Mergebot 
AuthorDate: Thu Sep 21 21:03:52 2017 +

Prepare repository for deployment.
---
 content/get-started/wordcount-example/index.html | 370 ---
 1 file changed, 266 insertions(+), 104 deletions(-)

diff --git a/content/get-started/wordcount-example/index.html 
b/content/get-started/wordcount-example/index.html
index 513650c..5d6ea39 100644
--- a/content/get-started/wordcount-example/index.html
+++ b/content/get-started/wordcount-example/index.html
@@ -4,7 +4,7 @@
   
   
   
-  Beam WordCount Example
+  Beam WordCount Examples
   
   https://fonts.googleapis.com/css?family=Roboto:100,300,400"; 
rel="stylesheet">
@@ -143,65 +143,82 @@
 
 
 
-  Apache Beam WordCount Example
+  Apache Beam WordCount 
Examples
 
 
-  MinimalWordCount
-  Creating the Pipeline
-  Applying Pipeline 
Transforms
-  Running the Pipeline
+  MinimalWordCount example
+  Creating the pipeline
+  Applying pipeline 
transforms
+  Running the pipeline
 
   
-  WordCount Example
-  Specifying Explicit DoFns
-  Creating Composite 
Transforms
-  Using Parameterizable 
PipelineOptions
+  WordCount example
+  Specifying explicit DoFns
+  Creating composite 
transforms
+  Using parameterizable 
PipelineOptions
 
   
-  Debugging WordCount Example   
 
+  Debugging WordCount example   
 
   Logging
   Direct 
Runner
-  Dataflow Runner
+  Cloud Dataflow Runner
   Apache Spark Runner
   Apache Flink Runner
   Apache Apex Runner
 
   
-  Testing your Pipeline via 
PAssert
+  Testing your pipeline via 
PAssert
 
   
-  WindowedWordCount
+  WindowedWordCount example

   Unbounded and 
bounded pipeline input modes
-  Adding Timestamps to Data
+  Adding timestamps to data
   Windowing
   Reusing 
PTransforms over windowed PCollections
-  Write Results to an 
Unbounded Sink
+  Writing results to an 
unbounded sink
 
   
 
 
 
-  Adapt for: 
+  Adapt for:
   
 Java SDK
 Python SDK
   
 
 
-The WordCount examples demonstrate how to set up a processing pipeline that 
can read text, tokenize the text lines into individual words, and perform a 
frequency count on each of those words. The Beam SDKs contain a series of these 
four successively more detailed WordCount examples that build on each other. 
The input text for all the examples is a set of Shakespeare’s texts.
+The WordCount examples demonstrate how to set up a processing pipeline that 
can
+read text, tokenize the text lines into individual words, and perform a
+frequency count on each of those words. The Beam SDKs contain a series of these
+four successively more detailed WordCount examples that build on each other. 
The
+input text for all the examples is a set of Shakespeare’s texts.
 
-Each WordCount example introduces different concepts in the Beam 
programming model. Begin by understanding Minimal WordCount, the simplest of 
the examples. Once you feel comfortable with the basic principles in building a 
pipeline, continue on to learn more concepts in the other examples.
+Each WordCount example introduces different concepts in the Beam programming
+model. Begin by understanding Minimal WordCount, the simplest of the examples.
+Once you feel comfortable with the basic principles in building a pipeline,
+continue on to learn more concepts in the other examples.
 
 
-  Minimal WordCount demonstrates the basic principles 
involved in building a pipeline.
-  WordCount introduces some of the more common best 
practices in creating re-usable and maintainable pipelines.
+  Minimal WordCount demonstrates the basic principles 
involved in building a
+pipeline.
+  WordCount introduces some of the more common best 
practices in creating
+re-usable and maintainable pipelines.
   Debugging WordCount introduces logging and debugging 
practices.
-  Windowed WordCount demonstrates how you can use Beam’s 
programming model to handle both bounded and unbounded datasets.
+  Windowed WordCount demonstrates how you can use Beam’s 
programming model
+to handle both bounded and unbounded datasets.
 
 
-MinimalWordCount
+MinimalWordCount example
 
-Minimal WordCount demonstrates a simple pipeline that can read from a text 
file, apply transforms to tokenize and count the words, and write the data to 
an output text file. This example hard-codes the locations for its input and 
output files and doesn’t perform any error checking; it is intended to only 
show you the “bare bones” of creating a Beam pipeline. This lack of 
parameterization makes this particular pipeline less portable across different 
runners than standard Beam pipelines [...]

[beam-site] 03/04: Update with review feedback

2017-09-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 30d5e8ff774cbf26e9b27b05bd160e4ed95e1c4a
Author: melissa 
AuthorDate: Thu Sep 21 13:17:07 2017 -0700

Update with review feedback
---
 src/get-started/wordcount-example.md | 88 
 1 file changed, 48 insertions(+), 40 deletions(-)

diff --git a/src/get-started/wordcount-example.md 
b/src/get-started/wordcount-example.md
index 859fd3c..10fc220 100644
--- a/src/get-started/wordcount-example.md
+++ b/src/get-started/wordcount-example.md
@@ -1,11 +1,11 @@
 ---
 layout: default
-title: "Beam WordCount Example"
+title: "Beam WordCount Examples"
 permalink: get-started/wordcount-example/
 redirect_from: /use/wordcount-example/
 ---
 
-# Apache Beam WordCount Example
+# Apache Beam WordCount Examples
 
 * TOC
 {:toc}
@@ -37,7 +37,7 @@ continue on to learn more concepts in the other examples.
 * **Windowed WordCount** demonstrates how you can use Beam's programming model
   to handle both bounded and unbounded datasets.
 
-## MinimalWordCount
+## MinimalWordCount example
 
 Minimal WordCount demonstrates a simple pipeline that can read from a text 
file,
 apply transforms to tokenize and count the words, and write the data to an
@@ -147,7 +147,7 @@ To view the full code in Python, see
 The following sections explain these concepts in detail, using the relevant 
code
 excerpts from the Minimal WordCount pipeline.
 
-### Creating the Pipeline
+### Creating the pipeline
 
 In this example, the code first creates a `PipelineOptions` object. This object
 lets us set various options for our pipeline, such as the pipeline runner that
@@ -193,17 +193,17 @@ Pipeline p = Pipeline.create(options);
 {% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py 
tag:examples_wordcount_minimal_create
 %}```
 
-### Applying Pipeline Transforms
+### Applying pipeline transforms
 
 The Minimal WordCount pipeline contains several transforms to read data into 
the
 pipeline, manipulate or otherwise transform the data, and write out the 
results.
-Each transform represents an operation in the pipeline.
+Transforms can consist of an individual operation, or can contain multiple
+nested transforms (which is a [composite transform]({{ site.baseurl 
}}/documentation/programming-guide#transforms-composite)).
 
-Each transform takes some kind of input (data or otherwise), and produces some
-output data. The input and output data is represented by the SDK class
-`PCollection`. `PCollection` is a special class, provided by the Beam SDK, that
-you can use to represent a data set of virtually any size, including unbounded
-data sets.
+Each transform takes some kind of input data and produces some output data. The
+input and output data is often represented by the SDK class `PCollection`.
+`PCollection` is a special class, provided by the Beam SDK, that you can use to
+represent a data set of virtually any size, including unbounded data sets.
 
 
 Figure 1: The pipeline data flow.
@@ -306,11 +306,10 @@ The Minimal WordCount pipeline contains five transforms:
 Note that the `Write` transform produces a trivial result value of type 
`PDone`,
 which in this case is ignored.
 
-### Running the Pipeline
+### Running the pipeline
 
 Run the pipeline by calling the `run` method, which sends your pipeline to be
-executed by the pipeline runner that you specified when you created your
-pipeline.
+executed by the pipeline runner that you specified in your `PipelineOptions`.
 
 ```java
 p.run().waitUntilFinish();
@@ -320,11 +319,12 @@ p.run().waitUntilFinish();
 {% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py 
tag:examples_wordcount_minimal_run
 %}```
 
-Note that the `run` method is asynchronous. For a blocking execution, run your
-pipeline appending the `waitUntilFinish`
-`wait_until_finish` method.
+Note that the `run` method is asynchronous. For a blocking execution, call the
+`waitUntilFinish`
+`wait_until_finish` method on the result 
object
+returned by the call to `run`.
 
-## WordCount Example
+## WordCount example
 
 This WordCount example introduces a few recommended programming practices that
 can make your pipeline easier to read, write, and maintain. While not 
explicitly
@@ -333,7 +333,7 @@ your pipeline, and help make your pipeline's code reusable.
 
 This section assumes that you have a good understanding of the basic concepts 
in
 building a pipeline. If you feel that you aren't at that point yet, read the
-above section, [Minimal WordCount](#minimalwordcount).
+above section, [Minimal WordCount](#minimalwordcount-example).
 
 **To run this example in Java:**
 
@@ -431,7 +431,7 @@ To view the full code in Python, see
 The following sections explain these key concepts in detail, and break down the
 pipeline code into smaller sections

[beam-site] 04/04: This closes #311

2017-09-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 7e8a86b1863d818cb734a398ded89378797df78a
Merge: a13fc82 30d5e8f
Author: Mergebot 
AuthorDate: Thu Sep 21 21:00:52 2017 +

This closes #311

 src/get-started/wordcount-example.md | 350 +--
 1 file changed, 256 insertions(+), 94 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] 01/04: [BEAM-667] Verify and update wordcount snippets

2017-09-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit b288b69ecc9106728bf8f39fa7a27bd82e50f5c6
Author: melissa 
AuthorDate: Wed Aug 30 16:28:42 2017 -0700

[BEAM-667] Verify and update wordcount snippets
---
 src/get-started/wordcount-example.md | 61 ++--
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/src/get-started/wordcount-example.md 
b/src/get-started/wordcount-example.md
index cf0ebc0..371e6bb 100644
--- a/src/get-started/wordcount-example.md
+++ b/src/get-started/wordcount-example.md
@@ -11,7 +11,7 @@ redirect_from: /use/wordcount-example/
 {:toc}
 
 
-  Adapt for: 
+  Adapt for:
   
 Java SDK
 Python SDK
@@ -125,13 +125,13 @@ To view the full code in Python, see 
**[wordcount_minimal.py](https://github.com
 * Writing output (in this example: writing to a text file)
 * Running the Pipeline
 
-The following sections explain these concepts in detail along with excerpts of 
the relevant code from the Minimal WordCount pipeline.
+The following sections explain these concepts in detail, using the relevant 
code excerpts from the Minimal WordCount pipeline.
 
 ### Creating the Pipeline
 
-The first step in creating a Beam pipeline is to create a `PipelineOptions` 
object. This object lets us set various options for our pipeline, such as the 
pipeline runner that will execute our pipeline and any runner-specific 
configuration required by the chosen runner. In this example we set these 
options programmatically, but more often command-line arguments are used to set 
`PipelineOptions`. 
+The first step in creating a Beam pipeline is to create a `PipelineOptions` 
object. This object lets us set various options for our pipeline, such as the 
pipeline runner that will execute our pipeline and any runner-specific 
configuration required by the chosen runner. In this example we set these 
options programmatically, but more often, command-line arguments are used to 
set `PipelineOptions`.
 
-You can specify a runner for executing your pipeline, such as the 
`DataflowRunner` or `SparkRunner`. If you omit specifying a runner, as in this 
example, your pipeline will be executed locally using the `DirectRunner`. In 
the next sections, we will specify the pipeline's runner.
+You can specify a runner for executing your pipeline, such as the 
`DataflowRunner` or `SparkRunner`. If you omit specifying a runner, as in this 
example, your pipeline executes locally using the `DirectRunner`. In the next 
sections, we will specify the pipeline's runner.
 
 ```java
  PipelineOptions options = PipelineOptionsFactory.create();
@@ -154,7 +154,7 @@ You can specify a runner for executing your pipeline, such 
as the `DataflowRunne
 {% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets.py 
tag:examples_wordcount_minimal_options
 %}```
 
-The next step is to create a Pipeline object with the options we've just 
constructed. The Pipeline object builds up the graph of transformations to be 
executed, associated with that particular pipeline.
+The next step is to create a `Pipeline` object with the options we've just 
constructed. The Pipeline object builds up the graph of transformations to be 
executed, associated with that particular pipeline.
 
 ```java
 Pipeline p = Pipeline.create(options);
@@ -175,7 +175,7 @@ Figure 1: The pipeline data flow.
 
 The Minimal WordCount pipeline contains five transforms:
 
-1.  A text file `Read` transform is applied to the Pipeline object itself, and 
produces a `PCollection` as output. Each element in the output PCollection 
represents one line of text from the input file. This example uses input data 
stored in a publicly accessible Google Cloud Storage bucket ("gs://").
+1.  A text file `Read` transform is applied to the `Pipeline` object itself, 
and produces a `PCollection` as output. Each element in the output 
`PCollection` represents one line of text from the input file. This example 
uses input data stored in a publicly accessible Google Cloud Storage bucket 
("gs://").
 
 ```java
 p.apply(TextIO.read().from("gs://apache-beam-samples/shakespeare/*"))
@@ -209,7 +209,7 @@ The Minimal WordCount pipeline contains five transforms:
 
 3.  The SDK-provided `Count` transform is a generic transform that takes a 
`PCollection` of any type, and returns a `PCollection` of key/value pairs. Each 
key represents a unique element from the input collection, and each value 
represents the number of times that key appeared in the input collection.
 
-   In this pipeline, the input for `Count` is the `PCollection` of 
individual words generated by the previous `ParDo`, and the output is a 
`PCollection` of key/value pairs where each key represents a unique word in the 
text and the associated value is the occurrence count for each.
+In this pipeline, the input 

[beam-site] branch mergebot updated (9bc6678 -> 7e8a86b)

2017-09-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 9bc6678  This closes #313
 add a13fc82  Prepare repository for deployment.
 new b288b69  [BEAM-667] Verify and update wordcount snippets
 new e866d1d  Update with review feedback
 new 30d5e8f  Update with review feedback
 new 7e8a86b  This closes #311

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../runners/capability-matrix/index.html   | 296 +
 src/get-started/wordcount-example.md   | 350 +++--
 2 files changed, 552 insertions(+), 94 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 02/04: Update with review feedback

2017-09-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit e866d1dc11022a83493dd763bc4a4e9e41841a9f
Author: melissa 
AuthorDate: Mon Sep 11 10:49:47 2017 -0700

Update with review feedback
---
 src/get-started/wordcount-example.md | 281 +++
 1 file changed, 217 insertions(+), 64 deletions(-)

diff --git a/src/get-started/wordcount-example.md 
b/src/get-started/wordcount-example.md
index 371e6bb..859fd3c 100644
--- a/src/get-started/wordcount-example.md
+++ b/src/get-started/wordcount-example.md
@@ -18,18 +18,35 @@ redirect_from: /use/wordcount-example/
   
 
 
-The WordCount examples demonstrate how to set up a processing pipeline that 
can read text, tokenize the text lines into individual words, and perform a 
frequency count on each of those words. The Beam SDKs contain a series of these 
four successively more detailed WordCount examples that build on each other. 
The input text for all the examples is a set of Shakespeare's texts.
-
-Each WordCount example introduces different concepts in the Beam programming 
model. Begin by understanding Minimal WordCount, the simplest of the examples. 
Once you feel comfortable with the basic principles in building a pipeline, 
continue on to learn more concepts in the other examples.
-
-* **Minimal WordCount** demonstrates the basic principles involved in building 
a pipeline.
-* **WordCount** introduces some of the more common best practices in creating 
re-usable and maintainable pipelines.
+The WordCount examples demonstrate how to set up a processing pipeline that can
+read text, tokenize the text lines into individual words, and perform a
+frequency count on each of those words. The Beam SDKs contain a series of these
+four successively more detailed WordCount examples that build on each other. 
The
+input text for all the examples is a set of Shakespeare's texts.
+
+Each WordCount example introduces different concepts in the Beam programming
+model. Begin by understanding Minimal WordCount, the simplest of the examples.
+Once you feel comfortable with the basic principles in building a pipeline,
+continue on to learn more concepts in the other examples.
+
+* **Minimal WordCount** demonstrates the basic principles involved in building 
a
+  pipeline.
+* **WordCount** introduces some of the more common best practices in creating
+  re-usable and maintainable pipelines.
 * **Debugging WordCount** introduces logging and debugging practices.
-* **Windowed WordCount** demonstrates how you can use Beam's programming model 
to handle both bounded and unbounded datasets.
+* **Windowed WordCount** demonstrates how you can use Beam's programming model
+  to handle both bounded and unbounded datasets.
 
 ## MinimalWordCount
 
-Minimal WordCount demonstrates a simple pipeline that can read from a text 
file, apply transforms to tokenize and count the words, and write the data to 
an output text file. This example hard-codes the locations for its input and 
output files and doesn't perform any error checking; it is intended to only 
show you the "bare bones" of creating a Beam pipeline. This lack of 
parameterization makes this particular pipeline less portable across different 
runners than standard Beam pipelines. I [...]
+Minimal WordCount demonstrates a simple pipeline that can read from a text 
file,
+apply transforms to tokenize and count the words, and write the data to an
+output text file. This example hard-codes the locations for its input and 
output
+files and doesn't perform any error checking; it is intended to only show you
+the "bare bones" of creating a Beam pipeline. This lack of parameterization
+makes this particular pipeline less portable across different runners than
+standard Beam pipelines. In later examples, we will parameterize the pipeline's
+input and output sources and show other best practices.
 
 **To run this example in Java:**
 
@@ -73,7 +90,8 @@ $ mvn compile exec:java 
-Dexec.mainClass=org.apache.beam.examples.MinimalWordCou
  -Pdataflow-runner
 ```
 
-To view the full code in Java, see 
**[MinimalWordCount](https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java).**
+To view the full code in Java, see
+**[MinimalWordCount](https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/MinimalWordCount.java).**
 
 **To run this example in Python:**
 
@@ -113,7 +131,8 @@ python -m apache_beam.examples.wordcount_minimal --input 
gs://dataflow-samples/s
  --temp_location 
gs:///tmp/
 ```
 
-To view the full code in Python, see 
**[wordcount_minimal.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_minimal.py).**
+To view the full code in Python, see
+**[wordcount_minimal.py](https://github.com

[jira] [Created] (BEAM-2976) investigate/add WordCount walkthrough testing section for Python

2017-09-21 Thread Melissa Pashniak (JIRA)
Melissa Pashniak created BEAM-2976:
--

 Summary: investigate/add WordCount walkthrough testing section for 
Python
 Key: BEAM-2976
 URL: https://issues.apache.org/jira/browse/BEAM-2976
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Melissa Pashniak
Assignee: Melissa Pashniak






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4843

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Apex #2436

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2973) Jenkins PreCommit broken due to missing dependency

2017-09-21 Thread Daniel Oliveira (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175252#comment-16175252
 ] 

Daniel Oliveira commented on BEAM-2973:
---

No worries, it happens

> Jenkins PreCommit broken due to missing dependency
> --
>
> Key: BEAM-2973
> URL: https://issues.apache.org/jira/browse/BEAM-2973
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Blocker
> Fix For: Not applicable
>
>
> Jenkins seems to be failing nearly immediately this morning for multiple 
> builds. The error is:
> ERROR: Failed to parse POMs
> java.io.IOException: remote file operation failed: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall at 
> hudson.remoting.Channel@7b1b4155:beam7: hudson.remoting.ProxyException: 
> hudson.maven.MavenModuleSetBuild$MavenExecutionException: 
> org.apache.maven.project.ProjectBuildingException: Some problems were 
> encountered while processing the POMs:
> [ERROR] 'dependencies.dependency.version' for 
> org.apache.beam:beam-runners-flink_2.10:jar is missing. @ line 68, column 21
> [WARNING] 'artifactId' contains an expression but should be a constant. @ 
> org.apache.beam:beam-runners-flink_${flink.scala.version}:2.2.0-SNAPSHOT, 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall/runners/flink/pom.xml,
>  line 29, column 15
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/14497/console
> I suspect this commit is causing the issue:
> https://github.com/apache/beam/commit/ab975317e1aa532053b68ccc105e13afff0c0b1a



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2973) Jenkins PreCommit broken due to missing dependency

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16175247#comment-16175247
 ] 

ASF GitHub Bot commented on BEAM-2973:
--

Github user youngoli closed the pull request at:

https://github.com/apache/beam/pull/3873


> Jenkins PreCommit broken due to missing dependency
> --
>
> Key: BEAM-2973
> URL: https://issues.apache.org/jira/browse/BEAM-2973
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Blocker
> Fix For: Not applicable
>
>
> Jenkins seems to be failing nearly immediately this morning for multiple 
> builds. The error is:
> ERROR: Failed to parse POMs
> java.io.IOException: remote file operation failed: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall at 
> hudson.remoting.Channel@7b1b4155:beam7: hudson.remoting.ProxyException: 
> hudson.maven.MavenModuleSetBuild$MavenExecutionException: 
> org.apache.maven.project.ProjectBuildingException: Some problems were 
> encountered while processing the POMs:
> [ERROR] 'dependencies.dependency.version' for 
> org.apache.beam:beam-runners-flink_2.10:jar is missing. @ line 68, column 21
> [WARNING] 'artifactId' contains an expression but should be a constant. @ 
> org.apache.beam:beam-runners-flink_${flink.scala.version}:2.2.0-SNAPSHOT, 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall/runners/flink/pom.xml,
>  line 29, column 15
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/14497/console
> I suspect this commit is causing the issue:
> https://github.com/apache/beam/commit/ab975317e1aa532053b68ccc105e13afff0c0b1a



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3873: [BEAM-2973] Reverting change to attempt to fix Jenk...

2017-09-21 Thread youngoli
Github user youngoli closed the pull request at:

https://github.com/apache/beam/pull/3873


---


[2/2] beam git commit: Execute windowing in Fn API runner.

2017-09-21 Thread robertwb
Execute windowing in Fn API runner.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ccc32a25
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ccc32a25
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ccc32a25

Branch: refs/heads/master
Commit: ccc32a25fc139135b13c5fd14353377c4343e403
Parents: a92c45f
Author: Robert Bradshaw 
Authored: Thu Sep 14 00:30:10 2017 -0700
Committer: Robert Bradshaw 
Committed: Thu Sep 21 10:59:52 2017 -0700

--
 .../runners/portability/fn_api_runner.py| 52 ++--
 sdks/python/apache_beam/transforms/core.py  | 16 ++
 sdks/python/apache_beam/transforms/trigger.py   | 13 +
 3 files changed, 52 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/ccc32a25/sdks/python/apache_beam/runners/portability/fn_api_runner.py
--
diff --git a/sdks/python/apache_beam/runners/portability/fn_api_runner.py 
b/sdks/python/apache_beam/runners/portability/fn_api_runner.py
index b0faa38..74bae11 100644
--- a/sdks/python/apache_beam/runners/portability/fn_api_runner.py
+++ b/sdks/python/apache_beam/runners/portability/fn_api_runner.py
@@ -46,6 +46,7 @@ from apache_beam.runners.worker import bundle_processor
 from apache_beam.runners.worker import data_plane
 from apache_beam.runners.worker import operation_specs
 from apache_beam.runners.worker import sdk_worker
+from apache_beam.transforms import trigger
 from apache_beam.transforms.window import GlobalWindows
 from apache_beam.utils import proto_utils
 from apache_beam.utils import urns
@@ -126,25 +127,31 @@ OLDE_SOURCE_SPLITTABLE_DOFN_DATA = pickler.dumps(
 
 class _GroupingBuffer(object):
   """Used to accumulate groupded (shuffled) results."""
-  def __init__(self, pre_grouped_coder, post_grouped_coder):
+  def __init__(self, pre_grouped_coder, post_grouped_coder, windowing):
 self._key_coder = pre_grouped_coder.key_coder()
 self._pre_grouped_coder = pre_grouped_coder
 self._post_grouped_coder = post_grouped_coder
 self._table = collections.defaultdict(list)
+self._windowing = windowing
 
   def append(self, elements_data):
 input_stream = create_InputStream(elements_data)
 while input_stream.size() > 0:
-  key, value = self._pre_grouped_coder.get_impl().decode_from_stream(
-  input_stream, True).value
-  self._table[self._key_coder.encode(key)].append(value)
+  windowed_key_value = self._pre_grouped_coder.get_impl(
+  ).decode_from_stream(input_stream, True)
+  key = windowed_key_value.value[0]
+  windowed_value = windowed_key_value.with_value(
+  windowed_key_value.value[1])
+  self._table[self._key_coder.encode(key)].append(windowed_value)
 
   def __iter__(self):
 output_stream = create_OutputStream()
-for encoded_key, values in self._table.items():
+trigger_driver = trigger.create_trigger_driver(self._windowing, True)
+for encoded_key, windowed_values in self._table.items():
   key = self._key_coder.decode(encoded_key)
-  self._post_grouped_coder.get_impl().encode_to_stream(
-  GlobalWindows.windowed_value((key, values)), output_stream, True)
+  for wkvs in trigger_driver.process_entire_key(key, windowed_values):
+self._post_grouped_coder.get_impl().encode_to_stream(
+wkvs, output_stream, True)
 return iter([output_stream.get()])
 
 
@@ -326,7 +333,7 @@ class 
FnApiRunner(maptask_executor_runner.MapTaskExecutorRunner):
   for stage in stages:
 assert len(stage.transforms) == 1
 transform = stage.transforms[0]
-if transform.spec.urn == urns.GROUP_BY_KEY_ONLY_TRANSFORM:
+if transform.spec.urn == urns.GROUP_BY_KEY_TRANSFORM:
   for pcoll_id in transform.inputs.values():
 fix_pcoll_coder(pipeline_components.pcollections[pcoll_id])
   for pcoll_id in transform.outputs.values():
@@ -608,11 +615,21 @@ class 
FnApiRunner(maptask_executor_runner.MapTaskExecutorRunner):
 pcoll.coder_id = coders.get_id(coder)
 coders.populate_map(pipeline_components.coders)
 
-# Initial set of stages are singleton transforms.
+known_composites = set([urns.GROUP_BY_KEY_TRANSFORM])
+
+def leaf_transforms(root_ids):
+  for root_id in root_ids:
+root = pipeline_proto.components.transforms[root_id]
+if root.spec.urn in known_composites or not root.subtransforms:
+  yield root_id
+else:
+  for leaf in leaf_transforms(root.subtransforms):
+yield leaf
+
+# Initial set of stages are singleton leaf transforms.
 stages = [
-Stage(name, [transform])
-for name, transform in pipeline_proto.components.transforms.items()
-if not transf

[GitHub] beam pull request #3877: Do windowing in the Fn API runner itself.

2017-09-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3877


---


[1/2] beam git commit: Closes #3877

2017-09-21 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master a92c45f93 -> aed67731e


Closes #3877


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/aed67731
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/aed67731
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/aed67731

Branch: refs/heads/master
Commit: aed67731ea95404c82fdb52ddfdd610dfdc7cb07
Parents: a92c45f ccc32a2
Author: Robert Bradshaw 
Authored: Thu Sep 21 10:59:52 2017 -0700
Committer: Robert Bradshaw 
Committed: Thu Sep 21 10:59:52 2017 -0700

--
 .../runners/portability/fn_api_runner.py| 52 ++--
 sdks/python/apache_beam/transforms/core.py  | 16 ++
 sdks/python/apache_beam/transforms/trigger.py   | 13 +
 3 files changed, 52 insertions(+), 29 deletions(-)
--




Jenkins build is back to normal : beam_PostCommit_Python_Verify #3183

2017-09-21 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2084) Distribution metrics should be queriable in the Dataflow Runner

2017-09-21 Thread Pablo Estrada (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-2084.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> Distribution metrics should be queriable in the Dataflow Runner
> ---
>
> Key: BEAM-2084
> URL: https://issues.apache.org/jira/browse/BEAM-2084
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Python_Verify #3182

2017-09-21 Thread Apache Jenkins Server
See 


--
[...truncated 44.46 KB...]
Collecting mock<3.0.0,>=1.0.1 (from apache-beam==2.2.0.dev0)
  Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting oauth2client<4.0.0,>=2.0.1 (from apache-beam==2.2.0.dev0)
Collecting protobuf<=3.3.0,>=3.2.0 (from apache-beam==2.2.0.dev0)
  Using cached protobuf-3.3.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyyaml<4.0.0,>=3.12 (from apache-beam==2.2.0.dev0)
Collecting six<1.11,>=1.9 (from apache-beam==2.2.0.dev0)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting typing<3.7.0,>=3.6.0 (from apache-beam==2.2.0.dev0)
  Using cached typing-3.6.2-py2-none-any.whl
Requirement already satisfied: enum34>=1.0.4 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: futures>=2.2.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Collecting pbr>=0.11 (from mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached pbr-3.1.1-py2.py3-none-any.whl
Collecting funcsigs>=1; python_version < "3.3" (from 
mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1-0.3.6-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.0.5 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1_modules-0.1.4-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
protobuf<=3.3.0,>=3.2.0->apache-beam==2.2.0.dev0)
Building wheels for collected packages: apache-beam
  Running setup.py bdist_wheel for apache-beam: started
  Running setup.py bdist_wheel for apache-beam: finished with status 'error'
  Complete output from command 

 -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-JCPVVT-build/setup.py';f=getattr(tokenize, 'open', 
open)(__file__);code=f.read().replace('\r\n', 
'\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d 
/tmp/tmpyE9lrgpip-wheel- --python-tag cp27:
  
:351:
 UserWarning: Normalizing '2.2.0.dev' to '2.2.0.dev0'
normalized_version,
  running bdist_wheel
  running build
  running build_py
  Traceback (most recent call last):
File "", line 1, in 
File "/tmp/pip-JCPVVT-build/setup.py", line 203, in 
  'test': generate_protos_first(test),
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
  dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
  self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File 
"
 line 204, in run
  self.run_command('build')
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
  self.run_command(cmd_name)
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/tmp/pip-JCPVVT-build/setup.py", line 143, in run
  gen_protos.generate_proto_files()
File "gen_protos.py", line 65, in generate_proto_files
  'Not in apache git tree; unable to find proto definitions.')
  RuntimeError: Not in apache git tree; unable to find proto definitions.
  
  
  Failed building wheel for apache-beam
  Running setup.py clean for apache-beam
Failed to build apache-beam
Installing collected packages: avro, crcmod, dill, httplib2, pbr, six, 
funcsigs, mock, pyasn1, rsa, pyasn1-modules, oauth2client, protobuf, pyyaml, 
typing, apache-beam
  Found existing installation: six 1.11.0
Uninstalling six-1.11.0:
  Successfully uninstalled six-1.11.0
  Found existing installation: protobuf 3.4.0
Uninstalling protobuf-3.4.0:
  Successfully uninstalled protobuf-3.4.0
  Running setup.py install for apache-beam: started
Running setup.py install for apache-beam: finished with status 'error'
Complete output from comma

[jira] [Commented] (BEAM-793) JdbcIO can create a deadlock when parallelism is greater than 1

2017-09-21 Thread Guillaume Balaine (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174736#comment-16174736
 ] 

Guillaume Balaine commented on BEAM-793:


Alright, I made it work by adding a backoff strategy and rebuilding the batch 
on every retry. It's likely not the best solution, but it guarantees writes, 
which is good for bounded pipelines.


{code:java}
  private void processRecord(T record) throws RuntimeException {
    try {
      preparedStatement.clearParameters();
      spec.getPreparedStatementSetter().setParameters(record, preparedStatement);
      preparedStatement.addBatch();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  private void executeBatch() throws SQLException, IOException, InterruptedException {
    LOG.info("Writing bundle {} batch of {} statements",
        this.bundleUUID.toString(), batchCount);
    Sleeper sleeper = Sleeper.DEFAULT;
    BackOff backoff = BUNDLE_WRITE_BACKOFF.backoff();
    while (true) {
      // Batch upsert entities.
      try {
        int[] updates = preparedStatement.executeBatch();
        connection.commit();
        LOG.info("Successfully wrote {} statements of bundle {}",
            updates.length, this.bundleUUID.toString());
        // Break if the commit threw no exception.
        break;
      } catch (SQLException exception) {
        LOG.error("Error writing bundle {} to the Database ({}): {}",
            this.bundleUUID.toString(), exception.getErrorCode(), exception.getMessage());
        if (!BackOffUtils.next(sleeper, backoff)) {
          LOG.error("Aborting bundle {} after {} retries.",
              this.bundleUUID.toString(), MAX_RETRIES);
          throw exception;
        } else {
          // Rebuild the statement batch from the buffered records before retrying.
          records.stream().forEach(this::processRecord);
        }
      }
    }
    clearBatch();
  }

  private void clearBatch() {
    batchCount = 0;
    records.clear();
  }
{code}


> JdbcIO can create a deadlock when parallelism is greater than 1
> ---
>
> Key: BEAM-793
> URL: https://issues.apache.org/jira/browse/BEAM-793
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: 2.2.0
>
>
> With the following JdbcIO configuration, if the parallelism is greater than 
> 1, we can have a {{Deadlock found when trying to get lock; try restarting 
> transaction}}.
> {code}
> MysqlDataSource dbCfg = new MysqlDataSource();
> dbCfg.setDatabaseName("db");
> dbCfg.setUser("user");
> dbCfg.setPassword("pass");
> dbCfg.setServerName("localhost");
> dbCfg.setPortNumber(3306);
> p.apply(Create.of(data))
>     .apply(JdbcIO.<Tuple5<Integer, Integer, ByteString, Long, Long>>write()
>         .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(dbCfg))
>         .withStatement("INSERT INTO smth(loc,event_type,hash,begin_date,end_date) VALUES(?, ?, ?, ?, ?) ON DUPLICATE KEY UPDATE event_type=VALUES(event_type),end_date=VALUES(end_date)")
>         .withPreparedStatementSetter(new JdbcIO.PreparedStatementSetter<Tuple5<Integer, Integer, ByteString, Long, Long>>() {
>             public void setParameters(Tuple5<Integer, Integer, ByteString, Long, Long> element, PreparedStatement statement)
>                 throws Exception {
>                 statement.setInt(1, element.f0);
>                 statement.setInt(2, element.f1);
>                 statement.setBytes(3, element.f2.toByteArray());
>                 statement.setLong(4, element.f3);
>                 statement.setLong(5, element.f4);
>             }
>         }));
> {code}
> This can happen due to the {{autocommit}}. I'm going to investigate.
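
For illustration, a minimal sketch of the autocommit angle in plain JDBC (outside Beam; the statement and Tuple5 element type follow the snippet above, while the helper itself and the single-commit wiring are assumptions, not JdbcIO's actual code):

{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;
import javax.sql.DataSource;

import com.google.protobuf.ByteString;
import org.apache.flink.api.java.tuple.Tuple5;

// Hypothetical helper: one short transaction per batch instead of autocommit,
// so row locks taken by the upsert are released at a single commit point.
class BatchUpsert {
  static void writeBatch(
      DataSource dataSource,
      List<Tuple5<Integer, Integer, ByteString, Long, Long>> batch) throws Exception {
    try (Connection connection = dataSource.getConnection()) {
      connection.setAutoCommit(false); // the suspected culprit, switched off
      try (PreparedStatement statement = connection.prepareStatement(
          "INSERT INTO smth(loc,event_type,hash,begin_date,end_date) "
              + "VALUES(?, ?, ?, ?, ?) ON DUPLICATE KEY UPDATE "
              + "event_type=VALUES(event_type),end_date=VALUES(end_date)")) {
        for (Tuple5<Integer, Integer, ByteString, Long, Long> element : batch) {
          statement.setInt(1, element.f0);
          statement.setInt(2, element.f1);
          statement.setBytes(3, element.f2.toByteArray());
          statement.setLong(4, element.f3);
          statement.setLong(5, element.f4);
          statement.addBatch();
        }
        statement.executeBatch();
        connection.commit(); // exactly one commit per batch
      }
    }
  }
}
{code}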



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Reopened] (BEAM-2377) Cross compile flink runner to scala 2.11

2017-09-21 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek reopened BEAM-2377:


Commit was reverted.

> Cross compile flink runner to scala 2.11
> 
>
> Key: BEAM-2377
> URL: https://issues.apache.org/jira/browse/BEAM-2377
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Ole Langbehn
>Assignee: Aljoscha Krettek
> Fix For: 2.2.0
>
>
> The flink runner is compiled for flink built against scala 2.10. flink cross 
> compiles its scala artifacts against 2.10 and 2.11.
> In order to make it possible to use beam with the flink runner in scala 2.11 
> projects, it would be nice if you could publish the flink runner for 2.11 
> next to 2.10.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (BEAM-2973) Jenkins PreCommit broken due to missing dependency

2017-09-21 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed BEAM-2973.
--
   Resolution: Fixed
Fix Version/s: Not applicable

This was fixed in the revert commit e1548435c45b1e4b349f55df1e37e1b6de8fc500.

Sorry for the inconvenience (I merged the PR that caused the breakage).

> Jenkins PreCommit broken due to missing dependency
> --
>
> Key: BEAM-2973
> URL: https://issues.apache.org/jira/browse/BEAM-2973
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Blocker
> Fix For: Not applicable
>
>
> Jenkins seems to be failing nearly immediately this morning for multiple 
> builds. The error is:
> ERROR: Failed to parse POMs
> java.io.IOException: remote file operation failed: 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall at 
> hudson.remoting.Channel@7b1b4155:beam7: hudson.remoting.ProxyException: 
> hudson.maven.MavenModuleSetBuild$MavenExecutionException: 
> org.apache.maven.project.ProjectBuildingException: Some problems were 
> encountered while processing the POMs:
> [ERROR] 'dependencies.dependency.version' for 
> org.apache.beam:beam-runners-flink_2.10:jar is missing. @ line 68, column 21
> [WARNING] 'artifactId' contains an expression but should be a constant. @ 
> org.apache.beam:beam-runners-flink_${flink.scala.version}:2.2.0-SNAPSHOT, 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_MavenInstall/runners/flink/pom.xml,
>  line 29, column 15
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/14497/console
> I suspect this commit is causing the issue:
> https://github.com/apache/beam/commit/ab975317e1aa532053b68ccc105e13afff0c0b1a



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2377) Cross compile flink runner to scala 2.11

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174641#comment-16174641
 ] 

ASF GitHub Bot commented on BEAM-2377:
--

GitHub user aljoscha opened a pull request:

https://github.com/apache/beam/pull/3881

[BEAM-2377] Allow cross compilation (2.10,2.11) for flink runner

This is a fixed version of #3255 which was reverted in #3875. The problem 
was that `flink.scala.version` was not set in the nexmark pom.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/beam fix-pr-3255-scala-version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3881.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3881


commit f61605b4199c6904a6dcbd111ffb906491fa9e0e
Author: Ole Langbehn 
Date:   2017-05-31T07:54:04Z

[BEAM-2377] Allow cross compilation (2.10,2.11) for flink runner

Flink can be built against scala 2.11, but the Flink Runner could not.

This commit fixes that and ensures that builds also work against scala 2.11
dependencies. It introduces a flink.scala.version mvn property that defaults
to 2.11, as well as a mvn profile that overrides the scala version to 2.10.
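
A minimal sketch of the pom mechanics this describes (the property name and the parameterized artifactId appear in the commit message and the BEAM-2973 build logs; the profile id here is an assumption):

{code:xml}
<!-- default: depend on Flink's Scala 2.11 artifacts -->
<properties>
  <flink.scala.version>2.11</flink.scala.version>
</properties>

<!-- the runner's artifactId picks up the property -->
<artifactId>beam-runners-flink_${flink.scala.version}</artifactId>

<!-- hypothetical profile id; activate it to fall back to Scala 2.10 -->
<profiles>
  <profile>
    <id>flink-scala-2.10</id>
    <properties>
      <flink.scala.version>2.10</flink.scala.version>
    </properties>
  </profile>
</profiles>
{code}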




> Cross compile flink runner to scala 2.11
> 
>
> Key: BEAM-2377
> URL: https://issues.apache.org/jira/browse/BEAM-2377
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Ole Langbehn
>Assignee: Aljoscha Krettek
> Fix For: 2.2.0
>
>
> The flink runner is compiled for flink built against scala 2.10. flink cross 
> compiles its scala artifacts against 2.10 and 2.11.
> In order to make it possible to use beam with the flink runner in scala 2.11 
> projects, it would be nice if you could publish the flink runner for 2.11 
> next to 2.10.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3881: [BEAM-2377] Allow cross compilation (2.10,2.11) for...

2017-09-21 Thread aljoscha
GitHub user aljoscha opened a pull request:

https://github.com/apache/beam/pull/3881

[BEAM-2377] Allow cross compilation (2.10,2.11) for flink runner

This is a fixed version of #3255 which was reverted in #3875. The problem 
was that `flink.scala.version` was not set in the nexmark pom.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/beam fix-pr-3255-scala-version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3881.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3881


commit f61605b4199c6904a6dcbd111ffb906491fa9e0e
Author: Ole Langbehn 
Date:   2017-05-31T07:54:04Z

[BEAM-2377] Allow cross compilation (2.10,2.11) for flink runner

Flink can be built against scala 2.11, but the Flink Runner could not.

This commit fixes that and ensures that builds also work against scala 2.11
dependencies. It introduces a flink.scala.version mvn property that defaults
to 2.11, as well as a mvn profile that overrides the scala version to 2.10.




---


Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #4842

2017-09-21 Thread Apache Jenkins Server
See 


--
[...truncated 562.69 KB...]
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
[JENKINS] Archiving disabled
2017-09-21T12:08:50.378 [INFO] 

2017-09-21T12:08:50.378 [INFO] Reactor Summary:
2017-09-21T12:08:50.378 [INFO] 
2017-09-21T12:08:50.378 [INFO] Apache Beam :: Parent 
.. SUCCESS [ 42.755 s]
2017-09-21T12:08:50.378 [INFO] Apache Beam :: SDKs :: Java :: Build Tools 
. SUCCESS [ 16.415 s]
2017-09-21T12:08:50.378 [INFO] Apache Beam :: SDKs 
 SUCCESS [  5.740 s]
2017-09-21T12:08:50.378 [INFO] Apache Beam :: SDKs :: Common 
.. SUCCESS [  3.346 s]
2017-09-21T12:08:50.378 [INFO] Apache Beam :: SDKs :: Common :: Runner API 
 SUCCESS [ 39.344 s]
2017-09-21T12:08:50.378 [INFO] Apache Beam :: SDKs :: Common :: Fn API 
 SUCCESS [ 21.995 s]
2017-09-21T12:08:50.378 [INFO] Apache Beam :: SDKs :: Java 
 SUCCESS [  3.259 s]
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: Core 
 SUCCESS [02:31 min]
2017-09-21T12:08:50.379 [INFO] Apache Beam :: Runners 
. SUCCESS [  3.004 s]
2017-09-21T12:08:50.379 [INFO] Apache Beam :: Runners :: Core Construction Java 
... SUCCESS [ 37.208 s]
2017-09-21T12:08:50.379 [INFO] Apache Beam :: Runners :: Core Java 
 SUCCESS [ 51.037 s]
2017-09-21T12:08:50.379 [INFO] Apache Beam :: Runners :: Direct Java 
.. FAILURE [  0.426 s]
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO 
.. SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: AMQP 
.. SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Common 
 SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Cassandra 
. SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: 
Elasticsearch . SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: 
Elasticsearch-Tests SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: 
Elasticsearch-Tests :: Common SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: 
Elasticsearch-Tests :: 2.x SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: 
Elasticsearch-Tests :: 5.x SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: Extensions 
.. SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: Extensions :: 
Google Cloud Platform Core SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: Extensions :: 
Protobuf SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Google 
Cloud Platform SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoop 
Common . SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoop File 
System SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoop 
 SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoop :: 
input-format SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: Runners :: Google Cloud Dataflow 
 SKIPPED
2017-09-21T12:08:50.379 [INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoo

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex #2435

2017-09-21 Thread Apache Jenkins Server
See 


--
[...truncated 315.00 KB...]
2017-09-21T12:03:52.535 [INFO] Excluding 
org.apache.beam:beam-sdks-common-runner-api:jar:2.2.0-SNAPSHOT from the shaded 
jar.
2017-09-21T12:03:52.535 [INFO] Including com.google.guava:guava:jar:20.0 in the 
shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding 
com.google.protobuf:protobuf-java:jar:3.2.0 from the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the 
shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding 
com.google.errorprone:error_prone_annotations:jar:2.0.15 from the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding 
com.google.code.findbugs:jsr305:jar:3.0.1 from the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from 
the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding 
com.google.instrumentation:instrumentation-api:jar:0.3.0 from the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from 
the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding 
com.google.protobuf:protobuf-java-util:jar:3.2.0 from the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding com.google.code.gson:gson:jar:2.7 from 
the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 
from the shaded jar.
2017-09-21T12:03:52.535 [INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the 
shaded jar.
2017-09-21T12:03:55.222 [INFO] Replacing original artifact with shaded artifact.
2017-09-21T12:03:55.222 [INFO] Replacing 

 with 

2017-09-21T12:03:55.222 [INFO] Replacing original test artifact with shaded 
test artifact.
2017-09-21T12:03:55.223 [INFO] Replacing 

 with 

2017-09-21T12:03:55.488 [INFO] 
2017-09-21T12:03:55.489 [INFO] --- maven-dependency-plugin:3.0.1:analyze-only 
(default) @ beam-sdks-common-fn-api ---
2017-09-21T12:03:55.609 [INFO] No dependency problems found
[JENKINS] Archiving disabled
2017-09-21T12:03:57.315 [INFO]  
   
2017-09-21T12:03:57.315 [INFO] 

2017-09-21T12:03:57.315 [INFO] Building Apache Beam :: SDKs :: Java 
2.2.0-SNAPSHOT
2017-09-21T12:03:57.315 [INFO] 

2017-09-21T12:03:57.318 [INFO] 
2017-09-21T12:03:57.319 [INFO] --- maven-clean-plugin:3.0.0:clean 
(default-clean) @ beam-sdks-java-parent ---
2017-09-21T12:03:57.325 [INFO] Deleting 

 (includes = [**/*.pyc, **/*.egg-info/, **/sdks/python/LICENSE, 
**/sdks/python/NOTICE, **/sdks/python/README.md], excludes = [])
2017-09-21T12:03:57.552 [INFO] 
2017-09-21T12:03:57.552 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce) @ beam-sdks-java-parent ---
2017-09-21T12:03:57.660 [INFO] 
2017-09-21T12:03:57.660 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce-banned-dependencies) @ beam-sdks-java-parent ---
2017-09-21T12:03:57.770 [INFO] 
2017-09-21T12:03:57.770 [INFO] --- maven-remote-resources-plugin:1.5:process 
(process-resource-bundles) @ beam-sdks-java-parent ---
2017-09-21T12:03:57.900 [INFO] 
2017-09-21T12:03:57.900 [INFO] --- maven-checkstyle-plugin:2.17:check (default) 
@ beam-sdks-java-parent ---
2017-09-21T12:03:58.697 [INFO] Starting audit...
Audit done.
2017-09-21T12:03:58.804 [INFO] 
2017-09-21T12:03:58.804 [INFO] --- 
build-helper-maven-plugin:3.0.0:regex-properties (render-artifact-id) @ 
beam-sdks-java-parent ---
2017-09-21T12:03:58.912 [INFO] 
2017-09-21T12:03:58.912 [INFO] --- maven-site-plugin:3.5.1:attach-descriptor 
(attach-descriptor) @ beam-sdks-java-parent ---
2017-09-21T12:03:59.139 [INFO] 
2017-09-21T12:03:59.139 [INFO] --- maven-jar-plugin:3.0.2:jar (default-jar) @ 
beam-sdks-java-parent ---
2017-09-21T12:03:59.140 [INFO] Skipping packaging of the jar
2017-09-21T12:03:59.364 [INFO] 
2017-09-21T12:03:59.364 [INFO] --- maven-jar-plugin:3.0.2:test-jar 
(default-test-jar) @ beam-sdks-java-parent ---
2017-09-21T12:03:59.365 [INFO] Skipping packaging of the test-jar
2017-09-21T12:03:59.472 [INFO] 
2017-09-21T12:03:59.472 [INFO] --- maven-shade-plugin:3.0.0:shade 
(bundle-and-repackage) @ beam-sdks-java-p

[GitHub] beam pull request #3880: JStorm-runner: Add unit tests for JStorm runner, op...

2017-09-21 Thread bastiliu
GitHub user bastiliu opened a pull request:

https://github.com/apache/beam/pull/3880

JStorm-runner: Add unit tests for JStorm runner, options and pipeline…

… translator.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bastiliu/beam jstorm-runner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3880.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3880


commit d05bf56680eb12d3a9d0c96cca4efbc05d423868
Author: basti.lj 
Date:   2017-09-21T09:20:50Z

JStorm-runner: Add unit tests for JStorm runner, options and pipeline 
translator.




---


Build failed in Jenkins: beam_PostCommit_Python_Verify #3181

2017-09-21 Thread Apache Jenkins Server
See 


--
[...truncated 44.46 KB...]
Collecting mock<3.0.0,>=1.0.1 (from apache-beam==2.2.0.dev0)
  Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting oauth2client<4.0.0,>=2.0.1 (from apache-beam==2.2.0.dev0)
Collecting protobuf<=3.3.0,>=3.2.0 (from apache-beam==2.2.0.dev0)
  Using cached protobuf-3.3.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyyaml<4.0.0,>=3.12 (from apache-beam==2.2.0.dev0)
Collecting six<1.11,>=1.9 (from apache-beam==2.2.0.dev0)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting typing<3.7.0,>=3.6.0 (from apache-beam==2.2.0.dev0)
  Using cached typing-3.6.2-py2-none-any.whl
Requirement already satisfied: enum34>=1.0.4 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: futures>=2.2.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Collecting funcsigs>=1; python_version < "3.3" (from 
mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pbr>=0.11 (from mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached pbr-3.1.1-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1-0.3.6-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.0.5 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1_modules-0.1.4-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
protobuf<=3.3.0,>=3.2.0->apache-beam==2.2.0.dev0)
Building wheels for collected packages: apache-beam
  Running setup.py bdist_wheel for apache-beam: started
  Running setup.py bdist_wheel for apache-beam: finished with status 'error'
  Complete output from command 

 -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-3JeUsd-build/setup.py';f=getattr(tokenize, 'open', 
open)(__file__);code=f.read().replace('\r\n', 
'\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d 
/tmp/tmpIQlWnvpip-wheel- --python-tag cp27:
  
:351:
 UserWarning: Normalizing '2.2.0.dev' to '2.2.0.dev0'
normalized_version,
  running bdist_wheel
  running build
  running build_py
  Traceback (most recent call last):
File "", line 1, in 
File "/tmp/pip-3JeUsd-build/setup.py", line 203, in 
  'test': generate_protos_first(test),
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
  dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
  self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File 
"
 line 204, in run
  self.run_command('build')
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
  self.run_command(cmd_name)
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/tmp/pip-3JeUsd-build/setup.py", line 143, in run
  gen_protos.generate_proto_files()
File "gen_protos.py", line 65, in generate_proto_files
  'Not in apache git tree; unable to find proto definitions.')
  RuntimeError: Not in apache git tree; unable to find proto definitions.
  
  
  Failed building wheel for apache-beam
  Running setup.py clean for apache-beam
Failed to build apache-beam
Installing collected packages: avro, crcmod, dill, httplib2, funcsigs, pbr, 
six, mock, pyasn1, rsa, pyasn1-modules, oauth2client, protobuf, pyyaml, typing, 
apache-beam
  Found existing installation: six 1.11.0
Uninstalling six-1.11.0:
  Successfully uninstalled six-1.11.0
  Found existing installation: protobuf 3.4.0
Uninstalling protobuf-3.4.0:
  Successfully uninstalled protobuf-3.4.0
  Running setup.py install for apache-beam: started
Running setup.py install for apache-beam: finished with status 'error'
Complete output from comma

svn commit: r21730 - in /dev/beam/2.1.1: ./ apache-beam-2.1.1-python.zip apache-beam-2.1.1-python.zip.asc apache-beam-2.1.1-python.zip.md5 apache-beam-2.1.1-python.zip.sha1 apache-beam-2.1.1-python.zi

2017-09-21 Thread robertwb
Author: robertwb
Date: Thu Sep 21 07:47:16 2017
New Revision: 21730

Log:
Stage beam 2.1.1 release.

Added:
dev/beam/2.1.1/
dev/beam/2.1.1/apache-beam-2.1.1-python.zip   (with props)
dev/beam/2.1.1/apache-beam-2.1.1-python.zip.asc
dev/beam/2.1.1/apache-beam-2.1.1-python.zip.md5
dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1
dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256

Added: dev/beam/2.1.1/apache-beam-2.1.1-python.zip
==
Binary file - no diff available.

Propchange: dev/beam/2.1.1/apache-beam-2.1.1-python.zip
--
svn:mime-type = application/octet-stream

Added: dev/beam/2.1.1/apache-beam-2.1.1-python.zip.asc
==
--- dev/beam/2.1.1/apache-beam-2.1.1-python.zip.asc (added)
+++ dev/beam/2.1.1/apache-beam-2.1.1-python.zip.asc Thu Sep 21 07:47:16 2017
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIcBAABCgAGBQJZw234AAoJEP4oOH5D7B695xEQALNNVrsWDy6K3jmTm5n3fm+3
+cHaXdTLvrjUtnBa6TFV49JlE9OIukDDfCaZyXBfFpcUzN8vVr1Ogbe/OKr+cNM1j
+la7pRqSpqG63L2JcMq2R4htdWoS5LrJYF36DFSpmnEiRy1iS2O/aHKs8K6n5xpBO
+2UXXMBWZrn0ffaE+drrxpWW4umL2DqKibAJB9SMuqzcfKooDIjhZzulD3eH2KCZd
+4hZv8qxoSqPX6a+vBiVf1cZHJ87GgFsKA9bmaLs8uIrA7tw6qgu401lUEUJrCL80
+ge6e7wp+orhgzXPrIi0uXRQ6+vjg5jIKeARlKYYiZy/sfUxBx9zmul/4k28xpA4E
+c/xycyIoP8fe86sz6XR5DR1nVIMYCDkFcrE3gM5IzQo0yZkCrjUNBmLPcYUyFcJj
+yfC22keYIbql3GWkpFRncDtcJnBX/3qpkb6xgUCfPgzh1iATjql8tNMj5FHSWR/D
+tYU7Lomb0WQFOwVMWVk6pRfff9qzp8QdBd8K+D1PMXAtUlTTo3Ipu2+5QwdCvagU
+Qoj4LTpGtQb4QFMdGnO5i9IPiM6hAZOYv2EPLEGL0DF0ue6wDUeUWbrTEp6O8dJu
+kIn/3EQ1Px3f/iz8y66/OqoYj9yQsK0YKdnMXW0JKcqIiSs93OIW+iLRgEpWLAVl
+UUlixDnrQdWP9RPp3O/n
+=THtB
+-END PGP SIGNATURE-

Added: dev/beam/2.1.1/apache-beam-2.1.1-python.zip.md5
==
--- dev/beam/2.1.1/apache-beam-2.1.1-python.zip.md5 (added)
+++ dev/beam/2.1.1/apache-beam-2.1.1-python.zip.md5 Thu Sep 21 07:47:16 2017
@@ -0,0 +1 @@
+MD5 (apache-beam-2.1.1-python.zip) = d7ebd955bb9ec7871f23af987582f881

Added: dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1
==
--- dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1 (added)
+++ dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1 Thu Sep 21 07:47:16 2017
@@ -0,0 +1 @@
+93e903076fa2efdbe9ede9b399c43da9de3d0e6a  apache-beam-2.1.1-python.zip

Added: dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256
==
--- dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256 (added)
+++ dev/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256 Thu Sep 21 07:47:16 2017
@@ -0,0 +1 @@
+e8ebbe6b20b490357b56e1cf1cbf62e8fd915d4803108ef7af541869793b1793  
apache-beam-2.1.1-python.zip




svn commit: r21729 - /dev/beam/KEYS

2017-09-21 Thread robertwb
Author: robertwb
Date: Thu Sep 21 07:43:42 2017
New Revision: 21729

Log:
Add PGP key for Robert Bradshaw.

Modified:
dev/beam/KEYS

Modified: dev/beam/KEYS
==
--- dev/beam/KEYS (original)
+++ dev/beam/KEYS Thu Sep 21 07:43:42 2017
@@ -255,3 +255,59 @@ V3m2QtLzHBUqN+/FTzUyDASO8kO4J0OaBB6iOTlZ
 ic/pJUsMBOm0ADPVB7YO
 =5xpx
 -END PGP PUBLIC KEY BLOCK-
+pub   4096R/43EC1EBD 2017-09-21 [expires: 2021-09-21]
+uid   [ultimate] Robert Bradshaw 
+sig 343EC1EBD 2017-09-21  Robert Bradshaw 
+sub   4096R/B4A515D5 2017-09-21 [expires: 2021-09-21]
+sig  43EC1EBD 2017-09-21  Robert Bradshaw 
+-BEGIN PGP PUBLIC KEY BLOCK-
+
+mQINBFnDXGMBEADyuxKOkcO9hBk9xkKcLrHhw3s7D782fo9OrLMaMCInTcNJB/tp
+HMvOUuBlFa/2rn74ZPQhfFyTErlord4e60kTVPtKt34Ouj5udFxY8M8W0YsFOyoL
+dAHaNB6vwmAnOKgRm2PFwC1wMyeIzGinYWoml8OYpnwK3lf4MurizktHU4/h3tTg
++j2gDJDn3X2TObJ6t6Q19pyQ7yI8VdRl3KpbXSY5s6gEfcx7ihU2fg651Txmof8e
+mN5GjjkEcktkkEz8icHHI9vWLX4h7PSz2zwHm52Yet8ysVbBLB7HKnOStOpBebJf
+ja2EDt0AtbYpuOBSq+xcxKQQI44IXcqOcRNzThY7BJ62ZuMZ838oqsG6gFqOuhqA
+lkfPxubE61inQoU+ce8N9JSsjv5dcqmpE6Ul3StvZPNDf01G8E1tUzlzPt7UJiHM
+rrj3XI9oGfU+hRGaOwLne4IJqnU+noEUvBWnKICC35+R3/x9dtHe+qEKONpekPAV
+oG986icjW1L85r+6OVOJZPotzDgkQn4ArbOF92nnyXa8BJumF5ZNTr9mB3+L8pFt
+RgjLltxULDH3zejlBXgRoWfv2lVXigQNbRMlZnIKH5PwN4RwoskDbDA3WmzFfCPi
+rg7m1RPPZpbXiXAErq8vSsyceMVxgq7AFfkzwdgIHkCkwUcLlyAEd92eRQARAQAB
+tCVSb2JlcnQgQnJhZHNoYXcgPHJvYmVydHdiQGFwYWNoZS5vcmc+iQI9BBMBCgAn
+BQJZw1xjAhsDBQkHhh+ABQsJCAcDBRUKCQgLBRYCAwEAAh4BAheAAAoJEP4oOH5D
+7B69gCwQANqA3hO3yfulsf8ZZxFaFbypsit4TcThCgEStz/shs929rbsMah22REy
+x3ZUv62/2uxCnc3pncSnNVmMsECoL2uE5SKGOPGUvh3SFu5Huy6CpW2/xCLq0GSl
+AAWBqrmWI6yNGrQ77UfQAPPvFHft9NYyl1gGr3ypMDBR6I27vocT9FVLFD8Y8z5O
+FHLhO+UWpmFj2uIIBm3/UAnPu1ELn1EbiRvLq4nc59e/9ueXHIB232ndhdwxNDoc
+uQMoIP7X3s2iEVSMN5i/eNSAIdK9LBXzpsB6Kw+c9xVnIN1e2PU7zL/30EmlvQdr
+5LY3eLhT72eFkoN1adnSSHSf5WugiXDSw6yrkvpufBBW5bujh2Np/B5DbHyXq1G6
+fWUxVSCDFl+eClP3sGEoaO6SZxty+PGUnXX23CdQDbm7yhdha0sRK3i9f9v2hDN4
+89dq8YudnI4Sy14PuWSsRLadHJBFp9LIl7yajNKMAHdlRV3h01VJMYCAtjB0IicO
+FbPxNws6jkO3f3CK+MZg6oMiCQTEPrVNP027CSxUjt560sJug+vhQYGtpdi5oVP/
+zQKzxukwQEdd/bOXXjJg7H7J53H7IyVK/0fCmtGLmaJVx6ZAKE6Bjge5i21swRgN
+xCTFrKmDKG+CMPkuZyONf+T6XO4OiObB16MQua4EYbyFbaLkdzxbuQINBFnDXGMB
+EADD2sQhaY3dcIrDwjjf5kMd0LZXv8s2B8ZMSwUcm2LK5h+CbEDJizoYSBCNmPkh
+eUrTthg6JKQ0hpO51jCBz2fxTe49Wr8rzELMJs6Iu1AUsm1Z0uH0SElPHTSpPsBX
+/AbQimIPrfea4I3u0yO2VrQ2dVB4Gd0OTtY0350ld6jB7lhalxNMsBcUSYNzozwN
+QsxT9YgoO8WJJ59A6MDJDVwjgcIQBISWeW37ECtPVlrtNQpyqLlcRPQWBDeAfmR9
+MAxAmAF2AQnil4kOjeHxg9kgwUWXt3tmQTIWe79ipZYMo4q/MnLfClxbMou5HCUV
+Vnq4I4HZEU5g+zPelxsbaGhqancmhK+aynADdFKjHMFFkxDFgvYHPFXy3PVMyrHs
+c4dJw5LfPohorVOczCvC3iEuHEBezNk+tmkj/CUjLq0obhDY7UGh0xhIAfhpW0iw
+AbgkWejfwabDc2oOuf3pVUVOHgRDeuODzhIpBYOLQ+UL3slOK9V7Wsk0kC4IU22o
+SCwHcW+nGeqHMylEZqJhtjKm0aBo2JAxKw1yvRZ2gnqanUkreg+rZC3IJwrtje5t
+njcjQb8w7XQGV0pf/ddZWPG0GsRJ1DDNdvwkUymlVwKHxfe9R3O687onxxPnf80f
+2sSdIM0Uv8+F4UVjU5p0otUmT10DT8bBXStmhnpZqWjsdwARAQABiQIlBBgBCgAP
+BQJZw1xjAhsMBQkHhh+AAAoJEP4oOH5D7B69Ni4QAOZytW7pv7P0er93JlJdrL4G
+DPlY8habtjiIbXxGjr6WK4htSN+1VyiaGm6eNISa40RFSXDwgWgvoQZI+I7fha2G
+DxdA9Vn1Qs7rgcTg456PxyPV+apzaUXC/S6DSyI2ULz8IhUaGG0uxF5q4K8bf63u
+X9Ms2dlBawcK6YSVspVw70pqd/Rr/nEqrXrCkcnZEN7Fcy8EkonwhHiQnCJQnJNG
+wKEx2rMpgs7bmT+0H7wolGrMg1hpZV96H694HsI3CSe9QPrgrKzpMLla8KTTN7vZ
++i9XzzvjBojLXFFu8qw//0467RnyB/vqQ0x6nGY2qcEameQB7jKgyhIxGMQYGfI/
+qDcL0I/vJZkaK17fhrlY47V/UQVzi46CNcE3AR+lPScGwb/nOwJzXcTDCM5dHZU5
+KrxbxLZyMgWC01AJGGkVQS9RnF+WlNWqqfisFGm5ngT85iTgqkc60YV6cKjvumEz
+nalDwul4Nr9HP2oDEsjlS9+pMxqttkZQzxN7lKn5Fptf5DkiU04fOaFgwDhEvRUa
+ICu2ea0ieIQ9/G3ik/LkjQPufWUakgtnPOGgsM0geSaseUHiWrlkWJZfwnizERpA
+gNWtoMhOK+ZfVj/buLwl9yzFpF0fzpNm4yiTC3RQcfAa/auczsqymCvDiZSbEVSP
+Tk/FVT3ZFFSU1UigaXZY
+=/AhW
+-END PGP PUBLIC KEY BLOCK-




[jira] [Commented] (BEAM-793) JdbcIO can create a deadlock when parallelism is greater than 1

2017-09-21 Thread Guillaume Balaine (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174391#comment-16174391
 ] 

Guillaume Balaine commented on BEAM-793:


Hello Jean-Baptiste, thanks for being so responsive.
I copied over JdbcIO and am trying to make it work. The use case is a bounded 
data pipeline, so it does not matter if it adds a few seconds of processing 
time. I use the same strategy as the Google Datastore implementation, using 
BackoffUtils to make the batch flush fn sleep increasingly longer when 
deadlocks occur.
Unfortunately, it seems that my pipeline then terminates before all batches can 
go through, perhaps because of the @Teardown which, in the current JdbcIO impl, 
closes the statement without caring whether a retry loop is still ongoing (a 
sketch of the intended ordering follows below).
I'll let you know if that works.
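
A minimal sketch of that ordering, assuming the copied-over WriteFn keeps the fields and the executeBatch() retry loop shown elsewhere in this thread (all names here are placeholders, not Beam's actual JdbcIO):

{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import org.apache.beam.sdk.transforms.DoFn;

// Hypothetical WriteFn: flush (with retries) in @FinishBundle so that
// @Teardown only ever closes an already-drained statement.
class RetryingWriteFn<T> extends DoFn<T, Void> {

  private transient Connection connection;
  private transient PreparedStatement preparedStatement;
  private transient int batchCount;

  @ProcessElement
  public void processElement(ProcessContext c) throws Exception {
    // Buffering c.element() into the prepared-statement batch is elided;
    // see the prepared-statement snippet posted on this issue.
    batchCount++;
  }

  @FinishBundle
  public void finishBundle() throws Exception {
    // The backoff/retry loop runs while the bundle is still alive, so
    // teardown can never race an ongoing retry.
    if (batchCount > 0) {
      executeBatch();
      batchCount = 0;
    }
  }

  @Teardown
  public void teardown() throws Exception {
    // Every bundle has flushed by now; closing is safe.
    if (preparedStatement != null) {
      preparedStatement.close();
    }
    if (connection != null) {
      connection.close();
    }
  }

  private void executeBatch() throws Exception {
    // executeBatch()/commit() with BackOffUtils-based retries, as in the
    // snippet posted on this issue.
  }
}
{code}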

> JdbcIO can create a deadlock when parallelism is greater than 1
> ---
>
> Key: BEAM-793
> URL: https://issues.apache.org/jira/browse/BEAM-793
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: 2.2.0
>
>
> With the following JdbcIO configuration, if the parallelism is greater than 
> 1, we can have a {{Deadlock found when trying to get lock; try restarting 
> transaction}}.
> {code}
> MysqlDataSource dbCfg = new MysqlDataSource();
> dbCfg.setDatabaseName("db");
> dbCfg.setUser("user");
> dbCfg.setPassword("pass");
> dbCfg.setServerName("localhost");
> dbCfg.setPortNumber(3306);
> p.apply(Create.of(data))
>     .apply(JdbcIO.<Tuple5<Integer, Integer, ByteString, Long, Long>>write()
>         .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(dbCfg))
>         .withStatement("INSERT INTO smth(loc,event_type,hash,begin_date,end_date) VALUES(?, ?, ?, ?, ?) ON DUPLICATE KEY UPDATE event_type=VALUES(event_type),end_date=VALUES(end_date)")
>         .withPreparedStatementSetter(new JdbcIO.PreparedStatementSetter<Tuple5<Integer, Integer, ByteString, Long, Long>>() {
>             public void setParameters(Tuple5<Integer, Integer, ByteString, Long, Long> element, PreparedStatement statement)
>                 throws Exception {
>                 statement.setInt(1, element.f0);
>                 statement.setInt(2, element.f1);
>                 statement.setBytes(3, element.f2.toByteArray());
>                 statement.setLong(4, element.f3);
>                 statement.setLong(5, element.f4);
>             }
>         }));
> {code}
> This can happen due to the {{autocommit}}. I'm going to investigate.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4841

2017-09-21 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #4014

2017-09-21 Thread Apache Jenkins Server
See