[jira] [Closed] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker

2016-07-07 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-381.


> OffsetBasedReader should construct sources before updating the range tracker
> 
>
> Key: BEAM-381
> URL: https://issues.apache.org/jira/browse/BEAM-381
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: 0.2.0-incubating
>
>
> OffsetBasedReader has the following code:
> {code}
>   if (!rangeTracker.trySplitAtPosition(splitOffset)) {
> return null;
>   }
>   long start = source.getStartOffset();
>   long end = source.getEndOffset();
>   OffsetBasedSource primary = source.createSourceForSubrange(start, 
> splitOffset);
>   OffsetBasedSource residual = 
> source.createSourceForSubrange(splitOffset, end);
>   this.source = primary;
>   return residual;
> {code}
> The first line is the line that updates the range of this source. However, 
> subsequent lines might throw (specifically, in 
> source.createSourceForSubrange). We should construct the sources first, and 
> then catch exceptions and return null if they fail. This way, the 
> splitAtFraction call will not throw (so work is not wasted) and the range 
> tracker will not be updated if either the primary or (more likely) the 
> residual could not be created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker

2016-07-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367239#comment-15367239
 ] 

ASF GitHub Bot commented on BEAM-381:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/600


> OffsetBasedReader should construct sources before updating the range tracker
> 
>
> Key: BEAM-381
> URL: https://issues.apache.org/jira/browse/BEAM-381
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: 0.2.0-incubating
>
>
> OffsetBasedReader has the following code:
> {code}
>   if (!rangeTracker.trySplitAtPosition(splitOffset)) {
> return null;
>   }
>   long start = source.getStartOffset();
>   long end = source.getEndOffset();
>   OffsetBasedSource primary = source.createSourceForSubrange(start, 
> splitOffset);
>   OffsetBasedSource residual = 
> source.createSourceForSubrange(splitOffset, end);
>   this.source = primary;
>   return residual;
> {code}
> The first line is the line that updates the range of this source. However, 
> subsequent lines might throw (specifically, in 
> source.createSourceForSubrange). We should construct the sources first, and 
> then catch exceptions and return null if they fail. This way, the 
> splitAtFraction call will not throw (so work is not wasted) and the range 
> tracker will not be updated if either the primary or (more likely) the 
> residual could not be created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker

2016-07-07 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin resolved BEAM-381.
--
Resolution: Fixed

> OffsetBasedReader should construct sources before updating the range tracker
> 
>
> Key: BEAM-381
> URL: https://issues.apache.org/jira/browse/BEAM-381
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: 0.2.0-incubating
>
>
> OffsetBasedReader has the following code:
> {code}
>   if (!rangeTracker.trySplitAtPosition(splitOffset)) {
> return null;
>   }
>   long start = source.getStartOffset();
>   long end = source.getEndOffset();
>   OffsetBasedSource primary = source.createSourceForSubrange(start, 
> splitOffset);
>   OffsetBasedSource residual = 
> source.createSourceForSubrange(splitOffset, end);
>   this.source = primary;
>   return residual;
> {code}
> The first line is the line that updates the range of this source. However, 
> subsequent lines might throw (specifically, in 
> source.createSourceForSubrange). We should construct the sources first, and 
> then catch exceptions and return null if they fail. This way, the 
> splitAtFraction call will not throw (so work is not wasted) and the range 
> tracker will not be updated if either the primary or (more likely) the 
> residual could not be created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: Closes #600

2016-07-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 921c55c94 -> 74e1f83df


Closes #600


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/74e1f83d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/74e1f83d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/74e1f83d

Branch: refs/heads/master
Commit: 74e1f83df651af61a98c388604d6cdd4f75d0ff5
Parents: 921c55c 543842c
Author: Dan Halperin 
Authored: Thu Jul 7 22:20:03 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 22:20:03 2016 -0700

--
 .../main/java/org/apache/beam/sdk/io/OffsetBasedSource.java| 6 +++---
 .../java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java   | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)
--




[jira] [Updated] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker

2016-07-07 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-381:
-
Fix Version/s: 0.2.0-incubating

> OffsetBasedReader should construct sources before updating the range tracker
> 
>
> Key: BEAM-381
> URL: https://issues.apache.org/jira/browse/BEAM-381
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: 0.2.0-incubating
>
>
> OffsetBasedReader has the following code:
> {code}
>   if (!rangeTracker.trySplitAtPosition(splitOffset)) {
> return null;
>   }
>   long start = source.getStartOffset();
>   long end = source.getEndOffset();
>   OffsetBasedSource primary = source.createSourceForSubrange(start, 
> splitOffset);
>   OffsetBasedSource residual = 
> source.createSourceForSubrange(splitOffset, end);
>   this.source = primary;
>   return residual;
> {code}
> The first line is the line that updates the range of this source. However, 
> subsequent lines might throw (specifically, in 
> source.createSourceForSubrange). We should construct the sources first, and 
> then catch exceptions and return null if they fail. This way, the 
> splitAtFraction call will not throw (so work is not wasted) and the range 
> tracker will not be updated if either the primary or (more likely) the 
> residual could not be created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #600: [BEAM-381] BoundedReader: update the range...

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/600


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: [BEAM-381] BoundedReader: update the range last of all

2016-07-07 Thread dhalperi
[BEAM-381] BoundedReader: update the range last of all

Reorders the code in some splitAtFraction calls so that the rangeTracker update
is the last thing (besides assignment) in the function. This avoids a potential
issue if creating the primary or residual sources happens to throw an exception.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/543842cb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/543842cb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/543842cb

Branch: refs/heads/master
Commit: 543842cbd9d433bcef9b7962d9c71a8779e99eb5
Parents: 921c55c
Author: Dan Halperin 
Authored: Wed Jul 6 23:05:56 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 22:20:03 2016 -0700

--
 .../main/java/org/apache/beam/sdk/io/OffsetBasedSource.java| 6 +++---
 .../java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java   | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/543842cb/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java
index d5a6801..8cbcd1f 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java
@@ -370,13 +370,13 @@ public abstract class OffsetBasedSource extends 
BoundedSource {
   LOG.debug(
   "Proposing to split OffsetBasedReader {} at fraction {} (offset {})",
   rangeTracker, fraction, splitOffset);
-  if (!rangeTracker.trySplitAtPosition(splitOffset)) {
-return null;
-  }
   long start = source.getStartOffset();
   long end = source.getEndOffset();
   OffsetBasedSource primary = source.createSourceForSubrange(start, 
splitOffset);
   OffsetBasedSource residual = 
source.createSourceForSubrange(splitOffset, end);
+  if (!rangeTracker.trySplitAtPosition(splitOffset)) {
+return null;
+  }
   this.source = primary;
   return residual;
 }

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/543842cb/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
index b4c3c75..0c485bf 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
@@ -861,11 +861,11 @@ public class BigtableIO {
   }
   logger.debug(
   "Proposing to split {} at fraction {} (key {})", rangeTracker, 
fraction, splitKey);
+  BigtableSource primary = source.withEndKey(splitKey);
+  BigtableSource residual = source.withStartKey(splitKey);
   if (!rangeTracker.trySplitAtPosition(splitKey)) {
 return null;
   }
-  BigtableSource primary = source.withEndKey(splitKey);
-  BigtableSource residual = source.withStartKey(splitKey);
   this.source = primary;
   return residual;
 }



[2/2] incubator-beam git commit: Closes #608

2016-07-07 Thread dhalperi
Closes #608


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e167d2b5
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e167d2b5
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e167d2b5

Branch: refs/heads/python-sdk
Commit: e167d2b5a9146f989ec069a53545e933029d7c1a
Parents: a580b31 1da908f
Author: Dan Halperin 
Authored: Thu Jul 7 22:04:21 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 22:04:21 2016 -0700

--
 sdks/python/run_postcommit.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[GitHub] incubator-beam pull request #608: Uncomment tox in the postcommit script.

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/608


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Uncomment tox in the postcommit script.

2016-07-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk a580b31ac -> e167d2b5a


Uncomment tox in the postcommit script.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1da908f7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1da908f7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1da908f7

Branch: refs/heads/python-sdk
Commit: 1da908f749dfd8bc66d8c0bda4b0302cc0f343fa
Parents: a580b31
Author: Ahmet Altay 
Authored: Thu Jul 7 16:38:52 2016 -0700
Committer: Ahmet Altay 
Committed: Thu Jul 7 16:38:52 2016 -0700

--
 sdks/python/run_postcommit.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1da908f7/sdks/python/run_postcommit.sh
--
diff --git a/sdks/python/run_postcommit.sh b/sdks/python/run_postcommit.sh
index 7af4e6c..23dd516 100755
--- a/sdks/python/run_postcommit.sh
+++ b/sdks/python/run_postcommit.sh
@@ -38,7 +38,7 @@ pip install virtualenv --user
 pip install tox --user
 
 # Tox runs unit tests in a virtual environment
-# ${LOCAL_PATH}/tox -e py27 -c sdks/python/tox.ini
+${LOCAL_PATH}/tox -e py27 -c sdks/python/tox.ini
 
 # Virtualenv for the rest of the script to run setup & e2e tests
 ${LOCAL_PATH}/virtualenv sdks/python



[GitHub] incubator-beam pull request #603: Modified addBulkOptions for simplicity

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/603


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Modified addBulkOptions for simplicity

2016-07-07 Thread dhalperi
Modified addBulkOptions for simplicity


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b9231826
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b9231826
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b9231826

Branch: refs/heads/master
Commit: b9231826bb9c8084f3802206fbdd1d9f69fea3a6
Parents: 155409b
Author: Ian Zhou 
Authored: Thu Jul 7 10:27:47 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 21:53:35 2016 -0700

--
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 95 +--
 .../sdk/io/gcp/bigtable/BigtableIOTest.java | 99 ++--
 2 files changed, 99 insertions(+), 95 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/b9231826/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
index dd17abe..b4c3c75 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
@@ -47,7 +47,6 @@ import com.google.bigtable.v1.Row;
 import com.google.bigtable.v1.RowFilter;
 import com.google.bigtable.v1.SampleRowKeysResponse;
 import com.google.cloud.bigtable.config.BigtableOptions;
-import com.google.cloud.bigtable.config.BulkOptions;
 import com.google.cloud.bigtable.config.RetryOptions;
 import com.google.common.base.MoreObjects;
 import com.google.common.collect.ImmutableList;
@@ -207,10 +206,23 @@ public class BigtableIO {
 public Read withBigtableOptions(BigtableOptions.Builder optionsBuilder) {
   checkNotNull(optionsBuilder, "optionsBuilder");
   // TODO: is there a better way to clone a Builder? Want it to be immune 
from user changes.
-  BigtableOptions.Builder clonedBuilder = 
optionsBuilder.build().toBuilder();
-  clonedBuilder.setDataChannelCount(1);
-  clonedBuilder = addRetryOptions(clonedBuilder);
+  BigtableOptions options = optionsBuilder.build();
+  RetryOptions retryOptions = options.getRetryOptions();
+
+  // Set data channel count to one because there is only 1 scanner in this 
session
+  // Use retryOptionsToBuilder because absent in Bigtable library
+  // TODO: replace with RetryOptions.toBuilder() when added to Bigtable 
library
+  // Set batch size because of bug (incorrect initialization) in Bigtable 
library
+  // TODO: remove setRetryOptions when fixed in Bigtable library
+  BigtableOptions.Builder clonedBuilder = options.toBuilder()
+  .setDataChannelCount(1)
+  .setRetryOptions(
+  retryOptionsToBuilder(retryOptions)
+  
.setStreamingBatchSize(Math.min(retryOptions.getStreamingBatchSize(),
+  retryOptions.getStreamingBufferSize() / 2))
+  .build());
   BigtableOptions optionsWithAgent = 
clonedBuilder.setUserAgent(getUserAgent()).build();
+
   return new Read(optionsWithAgent, tableId, filter, bigtableService);
 }
 
@@ -393,9 +405,24 @@ public class BigtableIO {
 public Write withBigtableOptions(BigtableOptions.Builder optionsBuilder) {
   checkNotNull(optionsBuilder, "optionsBuilder");
   // TODO: is there a better way to clone a Builder? Want it to be immune 
from user changes.
-  BigtableOptions.Builder clonedBuilder = 
optionsBuilder.build().toBuilder();
-  clonedBuilder = addBulkOptions(clonedBuilder);
-  clonedBuilder = addRetryOptions(clonedBuilder);
+  BigtableOptions options = optionsBuilder.build();
+  RetryOptions retryOptions = options.getRetryOptions();
+
+  // Set useBulkApi to true for enabling bulk writes
+  // Use retryOptionsToBuilder because absent in Bigtable library
+  // TODO: replace with RetryOptions.toBuilder() when added to Bigtable 
library
+  // Set batch size because of bug (incorrect initialization) in Bigtable 
library
+  // TODO: remove setRetryOptions when fixed in Bigtable library
+  BigtableOptions.Builder clonedBuilder = options.toBuilder()
+  .setBulkOptions(
+  options.getBulkOptions().toBuilder()
+  .setUseBulkApi(true)
+  .build())
+  .setRetryOptions(
+  retryOptionsToBuilder(retryOptions)
+  
.setStreamingBatchSize(Math.min(retryOptions.getStreamingBatchSize(),
+  

[1/2] incubator-beam git commit: Closes #603

2016-07-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 155409bf6 -> 290c0b772


Closes #603


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/290c0b77
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/290c0b77
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/290c0b77

Branch: refs/heads/master
Commit: 290c0b77280b5fab3b656b3009bfdc897784c6b5
Parents: 155409b b923182
Author: Dan Halperin 
Authored: Thu Jul 7 21:53:35 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 21:53:35 2016 -0700

--
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 95 +--
 .../sdk/io/gcp/bigtable/BigtableIOTest.java | 99 ++--
 2 files changed, 99 insertions(+), 95 deletions(-)
--




[jira] [Resolved] (BEAM-289) Examples Use TypeDescriptors

2016-07-07 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-289.
--
   Resolution: Fixed
Fix Version/s: 0.1.0-incubating

> Examples Use TypeDescriptors
> 
>
> Key: BEAM-289
> URL: https://issues.apache.org/jira/browse/BEAM-289
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Reporter: Jesse Anderson
>Assignee: Frances Perry
> Fix For: 0.1.0-incubating
>
>
> Change the Java and Java 8 examples to use TypeDescriptors instead of inline 
> TypeDescriptor creation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-289) Examples Use TypeDescriptors

2016-07-07 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-289:
-
Fix Version/s: (was: 0.1.0-incubating)
   0.2.0-incubating

> Examples Use TypeDescriptors
> 
>
> Key: BEAM-289
> URL: https://issues.apache.org/jira/browse/BEAM-289
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Reporter: Jesse Anderson
>Assignee: Frances Perry
> Fix For: 0.2.0-incubating
>
>
> Change the Java and Java 8 examples to use TypeDescriptors instead of inline 
> TypeDescriptor creation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-431) Examples dependencies on runners are a bit much and not enough

2016-07-07 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-431:


 Summary: Examples dependencies on runners are a bit much and not 
enough
 Key: BEAM-431
 URL: https://issues.apache.org/jira/browse/BEAM-431
 Project: Beam
  Issue Type: Bug
  Components: examples-java
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


The Java 7 examples directly depend on the Dataflow runner as a compile 
dependency. This should just be fixed and removed.

The Java 8 examples have optional runtime dependencies on the Dataflow and 
Flink runners. But even optional runtime dependencies must be resolved in a 
test scope, so it is not possible to exclude these from a hermetic testing 
environment - quite annoying. And the Spark runner should be included as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-429) minor: remove an obsolete comment in KafakIOTest.java

2016-07-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366949#comment-15366949
 ] 

ASF GitHub Bot commented on BEAM-429:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/606


> minor: remove an obsolete comment in KafakIOTest.java
> -
>
> Key: BEAM-429
> URL: https://issues.apache.org/jira/browse/BEAM-429
> Project: Beam
>  Issue Type: Improvement
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Minor
>
> see https://github.com/apache/incubator-beam/pull/606



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: Closes #606

2016-07-07 Thread dhalperi
Closes #606


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/155409bf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/155409bf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/155409bf

Branch: refs/heads/master
Commit: 155409bf6041380c309e4452719763a683d41936
Parents: 1963bde 1c9d16d
Author: Dan Halperin 
Authored: Thu Jul 7 16:13:36 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 16:13:36 2016 -0700

--
 .../src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java | 1 -
 1 file changed, 1 deletion(-)
--




[GitHub] incubator-beam pull request #606: [BEAM-429] remove an obsolete comment in K...

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/606


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: remove an obsolete comment in KafkaIOTest.java

2016-07-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 1963bde44 -> 155409bf6


remove an obsolete comment in KafkaIOTest.java

Kafka 10.0 added compatible constructor for ConsumerRecord. It was not present 
when I wrote this test. 
https://github.com/apache/kafka/commit/4e557f8ef60d46a8870704655c9a35092f74d125

Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1c9d16d7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1c9d16d7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1c9d16d7

Branch: refs/heads/master
Commit: 1c9d16d70ad2bd55b0736ea37f9357029f39ade5
Parents: 1963bde
Author: Raghu Angadi 
Authored: Thu Jul 7 14:22:41 2016 -0700
Committer: GitHub 
Committed: Thu Jul 7 14:22:41 2016 -0700

--
 .../src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1c9d16d7/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java
--
diff --git 
a/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java
 
b/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java
index 587e3e2..dd93823 100644
--- 
a/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java
+++ 
b/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java
@@ -126,7 +126,6 @@ public class KafkaIOTest {
 records.put(tp, new ArrayList>());
   }
   records.get(tp).add(
-  // Note: this interface has changed in 0.10. may get fixed before 
the release.
   new ConsumerRecord(
   tp.topic(),
   tp.partition(),



[GitHub] incubator-beam pull request #607: Mark primitive display data tests Runnable...

2016-07-07 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/607

Mark primitive display data tests RunnableOnService

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam displaydata-primitives

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/607.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #607


commit 0d30860483a2c8f3d3efde9a45a162e5b8ece476
Author: Scott Wegner 
Date:   2016-07-07T20:49:30Z

Mark primitive display data tests RunnableOnService




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (BEAM-391) Exceptions in gcsio upload thread causes pipeline to stall

2016-07-07 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-391:
-
Summary: Exceptions in gcsio upload thread causes pipeline to stall  (was: 
Invalid GCS bucket name causes pipeline to stall)

> Exceptions in gcsio upload thread causes pipeline to stall
> --
>
> Key: BEAM-391
> URL: https://issues.apache.org/jira/browse/BEAM-391
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>
> gcsio got stuck with invalid bucket name
> GcsBufferedWriter._start_upload (gcsio.py) raises an exception if the bucket 
> does not exist. This causes upload thread to silenty fail. It logs exception 
> to the log but this does not stop the pipeline or closes the receiving end of 
> the multiprocessing.Pipe(). Later a call in to write() blocks at 
> self.conn.send_bytes(). Note that send may block if the buffer is full.
> Upload thread should have a finally clause to close the socket connection. Or 
> better propagating the exception to its parent. This is true for other types 
> of exceptions also.
> Another small issue in the GcsBufferedWriter.close(). It does not self 
> self.close to True.
> reproduction: python -m apache_beam.examples.wordcount --output 
> gs://no-such-thing/
> Prints the exception but goes on forever. Ctrl + C breaks the main thread 
> shows where it got stuck.
> Similarly reproducible on the service.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-430) Introducing gcpTempLocation that default to tempLocation

2016-07-07 Thread Pei He (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366882#comment-15366882
 ] 

Pei He commented on BEAM-430:
-

Since stagingLocation will default to gcpTempLocation, and gcpTempLocation will 
default to tempLocation, DataflowRunner cannot use stagingLocation as the 
default value for tempLocation.
We will break the dependency cycle between stagingLocation and tempLocation, 
which is currently in DataflowRunner.

> Introducing gcpTempLocation that default to tempLocation
> 
>
> Key: BEAM-430
> URL: https://issues.apache.org/jira/browse/BEAM-430
> Project: Beam
>  Issue Type: Improvement
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>
> Currently, DataflowPipelineOptions.stagingLocation default to tempLocation. 
> And, it requires tempLocation to be a gcs path.
> Another case is BigQueryIO uses tempLocation and also requires it to be on 
> gcs.
> So, users cannot set tempLocation to a non-gcs path with DataflowRunner or 
> BigQueryIO.
> However, tempLocation could be on any file system. For example, WordCount 
> defaults to output to tempLocation.
> The proposal is to add gcpTempLocation. And, it defaults to tempLocation if 
> tempLocation is a gcs path.
> StagingLocation and BigQueryIO will use gcpTempLocation by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-429) minor: remove an obsolete comment in KafakIOTest.java

2016-07-07 Thread Raghu Angadi (JIRA)
Raghu Angadi created BEAM-429:
-

 Summary: minor: remove an obsolete comment in KafakIOTest.java
 Key: BEAM-429
 URL: https://issues.apache.org/jira/browse/BEAM-429
 Project: Beam
  Issue Type: Improvement
Reporter: Raghu Angadi
Assignee: Raghu Angadi
Priority: Minor


see https://github.com/apache/incubator-beam/pull/606



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #606: remove an obsolete comment in KafkaIOTest....

2016-07-07 Thread rangadi
GitHub user rangadi opened a pull request:

https://github.com/apache/incubator-beam/pull/606

remove an obsolete comment in KafkaIOTest.java

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Kafka 10.0 added compatible constructor for ConsumerRecord. It was not 
present when I wrote this test. 
https://github.com/apache/kafka/commit/4e557f8ef60d46a8870704655c9a35092f74d125

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/incubator-beam patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/606.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #606






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #604: Add support for ZLIB and DEFLATE compressi...

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/604


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Add support for ZLIB and DEFLATE compression

2016-07-07 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 9bc04b750 -> a580b31ac


Add support for ZLIB and DEFLATE compression

Code originally contributed by Slaven Bilac.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e1b3ac30
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e1b3ac30
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e1b3ac30

Branch: refs/heads/python-sdk
Commit: e1b3ac30b5dd5b14058247fad73b6b235e618094
Parents: 9bc04b7
Author: Robert Bradshaw 
Authored: Thu Jul 7 13:40:18 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 13:40:18 2016 -0700

--
 sdks/python/apache_beam/io/fileio.py  | 168 +
 sdks/python/apache_beam/io/fileio_test.py |  30 -
 2 files changed, 173 insertions(+), 25 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e1b3ac30/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py 
b/sdks/python/apache_beam/io/fileio.py
index 6475a34..31b6a93 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -20,7 +20,6 @@
 from __future__ import absolute_import
 
 import glob
-import gzip
 import logging
 from multiprocessing.pool import ThreadPool
 import os
@@ -28,6 +27,7 @@ import re
 import shutil
 import tempfile
 import time
+import zlib
 
 from apache_beam import coders
 from apache_beam.io import iobase
@@ -269,13 +269,129 @@ class _CompressionType(object):
 class CompressionTypes(object):
   """Enum-like class representing known compression types."""
   NO_COMPRESSION = _CompressionType(1)  # No compression.
-  DEFLATE = _CompressionType(2)  # 'Deflate' ie gzip compression.
+  DEFLATE = _CompressionType(2)  # 'Deflate' compression (without headers).
+  GZIP = _CompressionType(3)  # gzip compression (deflate with gzip headers).
+  ZLIB = _CompressionType(4)  # zlib compression (deflate with zlib headers).
 
   @staticmethod
-  def valid_compression_type(compression_type):
+  def is_valid_compression_type(compression_type):
 """Returns true for valid compression types, false otherwise."""
 return isinstance(compression_type, _CompressionType)
 
+  @staticmethod
+  def mime_type(compression_type, default='application/octet-stream'):
+if compression_type == CompressionTypes.GZIP:
+  return 'application/x-gzip'
+elif compression_type == CompressionTypes.ZLIB:
+  return 'application/octet-stream'
+elif compression_type == CompressionTypes.DEFLATE:
+  return 'application/octet-stream'
+else:
+  return default
+
+
+class _CompressedFile(object):
+  """Somewhat limited file wrapper for easier handling of compressed files."""
+  _type_mask = {
+  CompressionTypes.ZLIB:  zlib.MAX_WBITS,
+  CompressionTypes.GZIP: zlib.MAX_WBITS | 16,
+  CompressionTypes.DEFLATE: -zlib.MAX_WBITS,
+  }
+
+  def __init__(self,
+   fileobj=None,
+   compression_type=CompressionTypes.ZLIB,
+   read_size=16384):
+self._validate_compression_type(compression_type)
+if not fileobj:
+  raise ValueError('fileobj must be opened file but was %s' % fileobj)
+
+self.fileobj = fileobj
+self.data = ''
+self.read_size = read_size
+self.compression_type = compression_type
+if self._readable():
+  self.decompressor = self._create_decompressor(self.compression_type)
+else:
+  self.decompressor = None
+if self._writeable():
+  self.compressor = self._create_compressor(self.compression_type)
+else:
+  self.compressor = None
+
+  def _validate_compression_type(self, compression_type):
+if not CompressionTypes.is_valid_compression_type(compression_type):
+  raise TypeError('compression_type must be CompressionType object but '
+  'was %s' % type(compression_type))
+if compression_type == CompressionTypes.NO_COMPRESSION:
+  raise ValueError('cannot create object with no compression')
+
+  def _create_compressor(self, compression_type):
+self._validate_compression_type(compression_type)
+return zlib.compressobj(9, zlib.DEFLATED,
+self._type_mask[compression_type])
+
+  def _create_decompressor(self, compression_type):
+self._validate_compression_type(compression_type)
+return zlib.decompressobj(self._type_mask[compression_type])
+
+  def _readable(self):
+mode = self.fileobj.mode
+return 'r' in mode or 'a' in mode
+
+  def _writeable(self):
+mode = self.fileobj.mode
+return 'w' in mode or 'a' in mode
+
+  def write(self, data):
+"""Write data to file."""
+if not 

[2/2] incubator-beam git commit: Closes #604

2016-07-07 Thread robertwb
Closes #604


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a580b31a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a580b31a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a580b31a

Branch: refs/heads/python-sdk
Commit: a580b31ac5480318e632c3d8e33ea0bcfe5177d3
Parents: 9bc04b7 e1b3ac3
Author: Robert Bradshaw 
Authored: Thu Jul 7 14:22:22 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 14:22:22 2016 -0700

--
 sdks/python/apache_beam/io/fileio.py  | 168 +
 sdks/python/apache_beam/io/fileio_test.py |  30 -
 2 files changed, 173 insertions(+), 25 deletions(-)
--




[jira] [Commented] (BEAM-320) Provide Beam keyturn binary distributions embedding runners and execution runtime

2016-07-07 Thread Vlad Rozov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366779#comment-15366779
 ] 

Vlad Rozov commented on BEAM-320:
-

Apache Beam can do binary release that includes necessary binaries for runners, 
and publish it to the Apache Beam wiki, svn repository along with the source 
release, maven repository or other similar resources. I suspect that runner 
providers such as Apache Flink, Apache Apex or Apache Spark will provide their 
binary distributions as well. Such binary distributions are different from the 
official Apache source release and if somebody uses Apache Beam source release 
to build binaries, she will need to download or build dependent run-time 
libraries.

> Provide Beam keyturn binary distributions embedding runners and execution 
> runtime
> -
>
> Key: BEAM-320
> URL: https://issues.apache.org/jira/browse/BEAM-320
> Project: Beam
>  Issue Type: Wish
>  Components: build-system
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Now, the only distribution Beam provides is the source distribution.
> For new users, it could be interesting to have ready-to-use binary 
> distribution embedding the SDK, a specific runner with the backend execution 
> runtime.
> For instance, we could provide:
> - beam-spark-xxx.tar.gz containing SDK, Spark runner, Spark
> - beam-flink-xxx.tar.gz containing SDK, Flink runner, Flink
> Thoughts ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #605: Rename DataflowExampleUtils and DataflowEx...

2016-07-07 Thread peihe
GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/605

Rename DataflowExampleUtils and DataflowExampleOptions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam renaming

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/605.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #605


commit e1af028eed98e3b06b926f947494d46df0d9df36
Author: Pei He 
Date:   2016-07-07T20:45:24Z

Rename DataflowExampleUtils and DataflowExampleOptions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-320) Provide Beam keyturn binary distributions embedding runners and execution runtime

2016-07-07 Thread Amit Sela (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366734#comment-15366734
 ] 

Amit Sela commented on BEAM-320:


What about execution runtime source packaging ?

On one hand it would be easier for the user to have the runner dependency 
"bring-along" the execution runtime as well (like the flink-runner does), on 
the other hand I'm not an Apache Spark PMC or even a Committer, and I'm not 
sure if doing such a thing would be OK. Sounds like the people who own this 
code have to make this call. BTW [~jbonofre] how does it usually work ? How 
does it work with Hadoop across all projects using it ?

[~vrozov] mentioned that it's against Apache policy to include compiled 
binaries in the source release, how does this affect the choices we have ?

Adding a link to the conversation thread: 
http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201607.mbox/%3ccabqdq5b8jf20d97e1xvxf+lpzc7sk3irabq33ruposwsuxg...@mail.gmail.com%3E
   

> Provide Beam keyturn binary distributions embedding runners and execution 
> runtime
> -
>
> Key: BEAM-320
> URL: https://issues.apache.org/jira/browse/BEAM-320
> Project: Beam
>  Issue Type: Wish
>  Components: build-system
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Now, the only distribution Beam provides is the source distribution.
> For new users, it could be interesting to have ready-to-use binary 
> distribution embedding the SDK, a specific runner with the backend execution 
> runtime.
> For instance, we could provide:
> - beam-spark-xxx.tar.gz containing SDK, Spark runner, Spark
> - beam-flink-xxx.tar.gz containing SDK, Flink runner, Flink
> Thoughts ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #604: Add support for ZLIB and DEFLATE compressi...

2016-07-07 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/incubator-beam/pull/604

Add support for ZLIB and DEFLATE compression

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Code originally contributed by Slaven Bilac.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam compress

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/604.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #604


commit e1b3ac30b5dd5b14058247fad73b6b235e618094
Author: Robert Bradshaw 
Date:   2016-07-07T20:40:18Z

Add support for ZLIB and DEFLATE compression

Code originally contributed by Slaven Bilac.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests

2016-07-07 Thread Jason Kuster (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366643#comment-15366643
 ] 

Jason Kuster commented on BEAM-124:
---

Update: Batch WordCount E2E has been in for a while. Verifiers are coming in a 
pull request soon, and then we'll look at expanding to a streaming test as well.

> Testing -- End to End WordCount Batch and Streaming Tests
> -
>
> Key: BEAM-124
> URL: https://issues.apache.org/jira/browse/BEAM-124
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Steve Wheeler
>Assignee: Jason Kuster
>
> Set up testing infrastructure so that an end to end test for WordCount (both 
> batch and streaming) will be run periodically. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests

2016-07-07 Thread Jason Kuster (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366643#comment-15366643
 ] 

Jason Kuster edited comment on BEAM-124 at 7/7/16 7:28 PM:
---

Update: Batch WordCount E2E has been in for a while. Verifiers are coming in a 
pull request soon, and then we'll look at expanding to a streaming test as 
well. Work is in-progress to get them running on all runners.


was (Author: jasonkuster):
Update: Batch WordCount E2E has been in for a while. Verifiers are coming in a 
pull request soon, and then we'll look at expanding to a streaming test as well.

> Testing -- End to End WordCount Batch and Streaming Tests
> -
>
> Key: BEAM-124
> URL: https://issues.apache.org/jira/browse/BEAM-124
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Steve Wheeler
>Assignee: Jason Kuster
>
> Set up testing infrastructure so that an end to end test for WordCount (both 
> batch and streaming) will be run periodically. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-228) Create a merge bot for Beam

2016-07-07 Thread Jason Kuster (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Kuster closed BEAM-228.
-
   Resolution: Won't Fix
Fix Version/s: Not applicable

Mergebot is being delivered to apache infra instead - closing as obsolete.

> Create a merge bot for Beam
> ---
>
> Key: BEAM-228
> URL: https://issues.apache.org/jira/browse/BEAM-228
> Project: Beam
>  Issue Type: New Feature
>  Components: project-management
>Reporter: Jason Kuster
>Assignee: Jason Kuster
> Fix For: Not applicable
>
>
> This issue tracks the creation of a merge bot for Beam. This merge bot should 
> watch the Beam github repository and queue and merge pull requests which are 
> marked LGTM and good for merge by an approved Beam committer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[3/4] incubator-beam git commit: Better error message for poor use of callable apply

2016-07-07 Thread robertwb
Better error message for poor use of callable apply


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c34f332a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c34f332a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c34f332a

Branch: refs/heads/python-sdk
Commit: c34f332a3ae6e4ad914965732f6a038a883a5b3b
Parents: 31b3f00
Author: Robert Bradshaw 
Authored: Wed Jul 6 15:42:55 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 11:50:50 2016 -0700

--
 sdks/python/apache_beam/pipeline.py  |  4 
 sdks/python/apache_beam/pipeline_test.py | 13 -
 2 files changed, 16 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c34f332a/sdks/python/apache_beam/pipeline.py
--
diff --git a/sdks/python/apache_beam/pipeline.py 
b/sdks/python/apache_beam/pipeline.py
index a84cec3..012d4d9 100644
--- a/sdks/python/apache_beam/pipeline.py
+++ b/sdks/python/apache_beam/pipeline.py
@@ -47,6 +47,7 @@ import logging
 import os
 import shutil
 import tempfile
+import types
 
 from apache_beam import pvalue
 from apache_beam import typehints
@@ -196,6 +197,9 @@ class Pipeline(object):
 and needs to be cloned in order to apply again.
 """
 if not isinstance(transform, ptransform.PTransform):
+  if isinstance(transform, (type, types.ClassType)):
+raise TypeError("%s is not a PTransform instance, did you mean %s()?"
+% (transform, transform.__name__))
   transform = _CallableWrapperPTransform(transform)
 
 full_label = format_full_label(self._current_transform(), transform)

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c34f332a/sdks/python/apache_beam/pipeline_test.py
--
diff --git a/sdks/python/apache_beam/pipeline_test.py 
b/sdks/python/apache_beam/pipeline_test.py
index 5e94087..8598737 100644
--- a/sdks/python/apache_beam/pipeline_test.py
+++ b/sdks/python/apache_beam/pipeline_test.py
@@ -32,6 +32,7 @@ from apache_beam.transforms import Create
 from apache_beam.transforms import FlatMap
 from apache_beam.transforms import Flatten
 from apache_beam.transforms import Map
+from apache_beam.transforms import GroupByKey
 from apache_beam.transforms import PTransform
 from apache_beam.transforms import Read
 from apache_beam.transforms.util import assert_that, equal_to
@@ -174,10 +175,20 @@ class PipelineTest(unittest.TestCase):
   def test_apply_custom_callable(self):
 pipeline = Pipeline(self.runner_name)
 pcoll = pipeline | Create('pcoll', [1, 2, 3])
-result = pipeline.apply(PipelineTest.custom_callable, pcoll)
+result = pcoll | PipelineTest.custom_callable
 assert_that(result, equal_to([2, 3, 4]))
 pipeline.run()
 
+  def test_apply_custom_callable_error(self):
+pipeline = Pipeline(self.runner_name)
+pcoll = pipeline | Create('pcoll', [1, 2, 3])
+with self.assertRaises(TypeError) as cm:
+  pcoll | GroupByKey  # Note the missing ()'s
+self.assertEqual(
+cm.exception.message,
+" is not "
+"a PTransform instance, did you mean GroupByKey()?")
+
   def test_transform_no_super_init(self):
 class AddSuffix(PTransform):
 



[2/4] incubator-beam git commit: Cleanup dataflow_test.

2016-07-07 Thread robertwb
Cleanup dataflow_test.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e0643774
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e0643774
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e0643774

Branch: refs/heads/python-sdk
Commit: e06437746d0052f8e4af3edd5b19e4369038f826
Parents: 342d2d7
Author: Robert Bradshaw 
Authored: Wed Jul 6 14:45:58 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 11:50:49 2016 -0700

--
 sdks/python/apache_beam/dataflow_test.py | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e0643774/sdks/python/apache_beam/dataflow_test.py
--
diff --git a/sdks/python/apache_beam/dataflow_test.py 
b/sdks/python/apache_beam/dataflow_test.py
index d3721ee..c4933af 100644
--- a/sdks/python/apache_beam/dataflow_test.py
+++ b/sdks/python/apache_beam/dataflow_test.py
@@ -46,7 +46,7 @@ from apache_beam.transforms.window import WindowFn
 class DataflowTest(unittest.TestCase):
   """Dataflow integration tests."""
 
-  SAMPLE_DATA = 'aa bb cc aa bb aa \n' * 10
+  SAMPLE_DATA = ['aa bb cc aa bb aa \n'] * 10
   SAMPLE_RESULT = [('cc', 10), ('bb', 20), ('aa', 30)]
 
   # TODO(silviuc): Figure out a nice way to specify labels for stages so that
@@ -61,7 +61,7 @@ class DataflowTest(unittest.TestCase):
 
   def test_word_count(self):
 pipeline = Pipeline('DirectPipelineRunner')
-lines = pipeline | Create('SomeWords', [DataflowTest.SAMPLE_DATA])
+lines = pipeline | Create('SomeWords', DataflowTest.SAMPLE_DATA)
 result = (
 (lines | FlatMap('GetWords', lambda x: re.findall(r'\w+', x)))
 .apply('CountWords', DataflowTest.Count))
@@ -77,15 +77,6 @@ class DataflowTest(unittest.TestCase):
 assert_that(result, equal_to(['foo-A', 'foo-B', 'foo-C']))
 pipeline.run()
 
-  def test_word_count_using_get(self):
-pipeline = Pipeline('DirectPipelineRunner')
-lines = pipeline | Create('SomeWords', [DataflowTest.SAMPLE_DATA])
-result = (
-(lines | FlatMap('GetWords', lambda x: re.findall(r'\w+', x)))
-.apply('CountWords', DataflowTest.Count))
-assert_that(result, equal_to(DataflowTest.SAMPLE_RESULT))
-pipeline.run()
-
   def test_par_do_with_side_input_as_arg(self):
 pipeline = Pipeline('DirectPipelineRunner')
 words_list = ['aa', 'bb', 'cc']



[4/4] incubator-beam git commit: Closes #595

2016-07-07 Thread robertwb
Closes #595


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9bc04b75
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9bc04b75
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9bc04b75

Branch: refs/heads/python-sdk
Commit: 9bc04b7507b1e01688698ecc508468932042f5b8
Parents: 342d2d7 c34f332
Author: Robert Bradshaw 
Authored: Thu Jul 7 11:50:50 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 11:50:50 2016 -0700

--
 sdks/python/apache_beam/dataflow_test.py| 13 ++--
 .../examples/cookbook/custom_ptransform.py  |  7 ++--
 sdks/python/apache_beam/pipeline.py |  4 +++
 sdks/python/apache_beam/pipeline_test.py| 13 +++-
 sdks/python/apache_beam/transforms/combiners.py | 34 +---
 sdks/python/apache_beam/transforms/core.py  |  4 +--
 .../python/apache_beam/transforms/ptransform.py |  8 ++---
 .../apache_beam/transforms/ptransform_test.py   |  7 ++--
 sdks/python/apache_beam/transforms/util.py  |  8 ++---
 9 files changed, 48 insertions(+), 50 deletions(-)
--




[1/4] incubator-beam git commit: Remove unneeded label argument in ptransform_fn

2016-07-07 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 342d2d798 -> 9bc04b750


Remove unneeded label argument in ptransform_fn

The label is already in the fully qualified name due to nesting.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/31b3f00c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/31b3f00c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/31b3f00c

Branch: refs/heads/python-sdk
Commit: 31b3f00cf709ee06fd6f3d38567404861c0ae244
Parents: e064377
Author: Robert Bradshaw 
Authored: Wed Jul 6 15:15:51 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 11:50:49 2016 -0700

--
 .../examples/cookbook/custom_ptransform.py  |  7 ++--
 sdks/python/apache_beam/transforms/combiners.py | 34 +---
 sdks/python/apache_beam/transforms/core.py  |  4 +--
 .../python/apache_beam/transforms/ptransform.py |  8 ++---
 .../apache_beam/transforms/ptransform_test.py   |  7 ++--
 sdks/python/apache_beam/transforms/util.py  |  8 ++---
 6 files changed, 30 insertions(+), 38 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/31b3f00c/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
--
diff --git a/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py 
b/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
index f97545a..8da1f43 100644
--- a/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
+++ b/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py
@@ -57,7 +57,7 @@ def run_count2(known_args, options):
   """Runs the second example pipeline."""
 
   @beam.ptransform_fn
-  def Count(label, pcoll):  # pylint: disable=invalid-name,unused-argument
+  def Count(pcoll):  # pylint: disable=invalid-name
 """Count as a decorated function."""
 return (
 pcoll
@@ -76,12 +76,11 @@ def run_count3(known_args, options):
   """Runs the third example pipeline."""
 
   @beam.ptransform_fn
-  # pylint: disable=invalid-name,unused-argument
-  def Count(label, pcoll, factor=1):
+  # pylint: disable=invalid-name
+  def Count(pcoll, factor=1):
 """Count as a decorated function with a side input.
 
 Args:
-  label: optional label for this transform
   pcoll: the PCollection passed in from the previous transform
   factor: the amount by which to count
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/31b3f00c/sdks/python/apache_beam/transforms/combiners.py
--
diff --git a/sdks/python/apache_beam/transforms/combiners.py 
b/sdks/python/apache_beam/transforms/combiners.py
index e9f11a0..8c56e5a 100644
--- a/sdks/python/apache_beam/transforms/combiners.py
+++ b/sdks/python/apache_beam/transforms/combiners.py
@@ -148,7 +148,7 @@ class Top(object):
   # pylint: disable=no-self-argument
 
   @ptransform.ptransform_fn
-  def Of(label, pcoll, n, compare, *args, **kwargs):
+  def Of(pcoll, n, compare, *args, **kwargs):
 """Obtain a list of the compare-most N elements in a PCollection.
 
 This transform will retrieve the n greatest elements in the PCollection
@@ -160,7 +160,6 @@ class Top(object):
 become additional arguments to the comparator.
 
 Args:
-  label: display label for transform processes.
   pcoll: PCollection to process.
   n: number of elements to extract from pcoll.
   compare: as described above.
@@ -168,10 +167,10 @@ class Top(object):
   **kwargs: as described above.
 """
 return pcoll | core.CombineGlobally(
-label, TopCombineFn(n, compare), *args, **kwargs)
+TopCombineFn(n, compare), *args, **kwargs)
 
   @ptransform.ptransform_fn
-  def PerKey(label, pcoll, n, compare, *args, **kwargs):
+  def PerKey(pcoll, n, compare, *args, **kwargs):
 """Identifies the compare-most N elements associated with each key.
 
 This transform will produce a PCollection mapping unique keys in the input
@@ -184,7 +183,6 @@ class Top(object):
 become additional arguments to the comparator.
 
 Args:
-  label: display label for transform processes.
   pcoll: PCollection to process.
   n: number of elements to extract from pcoll.
   compare: as described above.
@@ -196,27 +194,27 @@ class Top(object):
 compatible with KV[A, B].
 """
 return pcoll | core.CombinePerKey(
-label, TopCombineFn(n, compare), *args, **kwargs)
+TopCombineFn(n, compare), *args, **kwargs)
 
   @ptransform.ptransform_fn
-  def Largest(label, pcoll, n):
+  def Largest(pcoll, n):
 """Obtain a list of the greatest N elements in a PCollection."""
-

[2/2] incubator-beam git commit: pipeline.options should never be None

2016-07-07 Thread robertwb
pipeline.options should never be None

Also fix a typehints error this exposed.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/87961e4b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/87961e4b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/87961e4b

Branch: refs/heads/python-sdk
Commit: 87961e4b5c489b1c973be3ce66cb66e1ab886228
Parents: 6b06e3e
Author: Robert Bradshaw 
Authored: Wed Jul 6 16:36:50 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 11:47:16 2016 -0700

--
 sdks/python/apache_beam/pipeline.py | 23 
 sdks/python/apache_beam/pvalue_test.py  |  3 ++-
 sdks/python/apache_beam/transforms/core.py  |  1 -
 sdks/python/apache_beam/typehints/typehints.py  |  5 -
 .../apache_beam/typehints/typehints_test.py |  3 +++
 5 files changed, 18 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/87961e4b/sdks/python/apache_beam/pipeline.py
--
diff --git a/sdks/python/apache_beam/pipeline.py 
b/sdks/python/apache_beam/pipeline.py
index ee83614..a84cec3 100644
--- a/sdks/python/apache_beam/pipeline.py
+++ b/sdks/python/apache_beam/pipeline.py
@@ -106,9 +106,9 @@ class Pipeline(object):
 raise ValueError(
 'Parameter argv, if specified, must be a list. Received : %r', 
argv)
 else:
-  self.options = None
+  self.options = PipelineOptions([])
 
-if runner is None and self.options is not None:
+if runner is None:
   runner = self.options.view_as(StandardOptions).runner
   if runner is None:
 runner = StandardOptions.DEFAULT_RUNNER
@@ -122,11 +122,10 @@ class Pipeline(object):
   'name of a registered runner.')
 
 # Validate pipeline options
-if self.options is not None:
-  errors = PipelineOptionsValidator(self.options, runner).validate()
-  if errors:
-raise ValueError(
-'Pipeline has validations errors: \n' + '\n'.join(errors))
+errors = PipelineOptionsValidator(self.options, runner).validate()
+if errors:
+  raise ValueError(
+  'Pipeline has validations errors: \n' + '\n'.join(errors))
 
 # Default runner to be used.
 self.runner = runner
@@ -151,7 +150,7 @@ class Pipeline(object):
 
   def run(self):
 """Runs the pipeline. Returns whatever our runner returns after running."""
-if not self.options or 
self.options.view_as(SetupOptions).save_main_session:
+if self.options.view_as(SetupOptions).save_main_session:
   # If this option is chosen, verify we can pickle the main session early.
   tmpdir = tempfile.mkdtemp()
   try:
@@ -226,12 +225,8 @@ class Pipeline(object):
 self._current_transform().add_part(current)
 self.transforms_stack.append(current)
 
-if self.options is not None:
-  type_options = self.options.view_as(TypeOptions)
-else:
-  type_options = None
-
-if type_options is not None and type_options.pipeline_type_check:
+type_options = self.options.view_as(TypeOptions)
+if type_options.pipeline_type_check:
   transform.type_check_inputs(pvalueish)
 
 pvalueish_result = self.runner.apply(transform, pvalueish)

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/87961e4b/sdks/python/apache_beam/pvalue_test.py
--
diff --git a/sdks/python/apache_beam/pvalue_test.py 
b/sdks/python/apache_beam/pvalue_test.py
index ef7e5f5..bb742e0 100644
--- a/sdks/python/apache_beam/pvalue_test.py
+++ b/sdks/python/apache_beam/pvalue_test.py
@@ -47,6 +47,7 @@ class PValueTest(unittest.TestCase):
 pipeline = Pipeline('DirectPipelineRunner')
 value = pipeline | Create('create1', [1, 2, 3])
 value2 = pipeline | Create('create2', [(1, 1), (2, 2), (3, 3)])
+value3 = pipeline | Create('create3', [(1, 1), (2, 2), (3, 3)])
 self.assertEqual(AsSingleton(value), AsSingleton(value))
 self.assertEqual(AsSingleton('new', value, default_value=1),
  AsSingleton('new', value, default_value=1))
@@ -59,7 +60,7 @@ class PValueTest(unittest.TestCase):
 self.assertNotEqual(AsSingleton(value), AsSingleton(value2))
 self.assertNotEqual(AsIter(value), AsIter(value2))
 self.assertNotEqual(AsList(value), AsList(value2))
-self.assertNotEqual(AsDict(value), AsDict(value2))
+self.assertNotEqual(AsDict(value2), AsDict(value3))
 
 
 if __name__ == '__main__':

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/87961e4b/sdks/python/apache_beam/transforms/core.py
--
diff 

[GitHub] incubator-beam pull request #597: pipeline.options should never be None

2016-07-07 Thread robertwb
Github user robertwb closed the pull request at:

https://github.com/apache/incubator-beam/pull/597


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Closes #597

2016-07-07 Thread robertwb
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 6b06e3e22 -> 342d2d798


Closes #597


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/342d2d79
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/342d2d79
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/342d2d79

Branch: refs/heads/python-sdk
Commit: 342d2d79829ed9da9b8066a4ff0a4a02992188d6
Parents: 6b06e3e 87961e4
Author: Robert Bradshaw 
Authored: Thu Jul 7 11:47:16 2016 -0700
Committer: Robert Bradshaw 
Committed: Thu Jul 7 11:47:16 2016 -0700

--
 sdks/python/apache_beam/pipeline.py | 23 
 sdks/python/apache_beam/pvalue_test.py  |  3 ++-
 sdks/python/apache_beam/transforms/core.py  |  1 -
 sdks/python/apache_beam/typehints/typehints.py  |  5 -
 .../apache_beam/typehints/typehints_test.py |  3 +++
 5 files changed, 18 insertions(+), 17 deletions(-)
--




[GitHub] incubator-beam pull request #603: Modified addBulkOptions for simplicity

2016-07-07 Thread ianzhou1
GitHub user ianzhou1 opened a pull request:

https://github.com/apache/incubator-beam/pull/603

Modified addBulkOptions for simplicity



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ianzhou1/incubator-beam AddBulkOptions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/603.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #603


commit e0d197638d24cadfa38afe39850ccae8861add67
Author: Ian Zhou 
Date:   2016-07-07T17:27:47Z

Modified addBulkOptions for simplicity




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #593: Run lint on python/sdks folder to include ...

2016-07-07 Thread aaltay
Github user aaltay closed the pull request at:

https://github.com/apache/incubator-beam/pull/593


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Enable linter rules no-self-argument, reimported, ungrouped-imports

2016-07-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 253497655 -> 6b06e3e22


Enable linter rules no-self-argument, reimported, ungrouped-imports

And run lint on python/sdks folder to include setup.py.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2b14fe5d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2b14fe5d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2b14fe5d

Branch: refs/heads/python-sdk
Commit: 2b14fe5dd4d289599bc902a39392a7956a4d59c9
Parents: 2534976
Author: Ahmet Altay 
Authored: Wed Jul 6 12:41:19 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 09:49:36 2016 -0700

--
 sdks/python/.pylintrc  | 3 ---
 sdks/python/apache_beam/examples/snippets/snippets.py  | 1 +
 sdks/python/apache_beam/examples/snippets/snippets_test.py | 7 ---
 sdks/python/apache_beam/transforms/combiners_test.py   | 9 -
 sdks/python/apache_beam/transforms/trigger.py  | 3 ++-
 sdks/python/run_pylint.sh  | 2 +-
 sdks/python/setup.py   | 5 ++---
 7 files changed, 14 insertions(+), 16 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2b14fe5d/sdks/python/.pylintrc
--
diff --git a/sdks/python/.pylintrc b/sdks/python/.pylintrc
index c138f1a..c69fd2b 100644
--- a/sdks/python/.pylintrc
+++ b/sdks/python/.pylintrc
@@ -101,7 +101,6 @@ disable =
   multiple-statements,
   no-member,
   no-name-in-module,
-  no-self-argument,
   no-self-use,
   no-value-for-parameter,
   not-callable,
@@ -112,14 +111,12 @@ disable =
   redefined-outer-name,
   redefined-variable-type,
   redundant-keyword-arg,
-  reimported,
   relative-import,
   similarities,
   simplifiable-if-statement,
   super-init-not-called,
   undefined-variable,
   unexpected-keyword-arg,
-  ungrouped-imports,
   unidiomatic-typecheck,
   unnecessary-lambda,
   unneeded-not,

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2b14fe5d/sdks/python/apache_beam/examples/snippets/snippets.py
--
diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py 
b/sdks/python/apache_beam/examples/snippets/snippets.py
index f5bbc66..d84deea 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets.py
@@ -39,6 +39,7 @@ import apache_beam as beam
 # pylint:disable=invalid-name
 # pylint:disable=expression-not-assigned
 # pylint:disable=redefined-outer-name
+# pylint:disable=reimported
 # pylint:disable=unused-variable
 # pylint:disable=wrong-import-order, wrong-import-position
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2b14fe5d/sdks/python/apache_beam/examples/snippets/snippets_test.py
--
diff --git a/sdks/python/apache_beam/examples/snippets/snippets_test.py 
b/sdks/python/apache_beam/examples/snippets/snippets_test.py
index 87ce266..6e1045f 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets_test.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets_test.py
@@ -26,10 +26,11 @@ import apache_beam as beam
 from apache_beam import io
 from apache_beam import pvalue
 from apache_beam import typehints
-from apache_beam.examples.snippets import snippets
 from apache_beam.io import fileio
 from apache_beam.utils.options import TypeOptions
+from apache_beam.examples.snippets import snippets
 
+# pylint: disable=expression-not-assigned
 
 # Monky-patch to use native sink for file path re-writing.
 io.TextFileSink = fileio.NativeTextFileSink
@@ -226,6 +227,7 @@ class TypeHintsTest(unittest.TestCase):
 
   def test_bad_types(self):
 p = beam.Pipeline('DirectPipelineRunner', argv=sys.argv)
+evens = None  # pylint: disable=unused-variable
 
 # [START type_hints_missing_define_numbers]
 numbers = p | beam.Create(['1', '2', '3'])
@@ -236,7 +238,7 @@ class TypeHintsTest(unittest.TestCase):
 evens = numbers | beam.Filter(lambda x: x % 2 == 0)
 # [END type_hints_missing_apply]
 
-# Now suppose numers was defined as [snippet above].
+# Now suppose numbers was defined as [snippet above].
 # When running this pipeline, you'd get a runtime error,
 # possibly on a remote machine, possibly very late.
 
@@ -298,7 +300,6 @@ class TypeHintsTest(unittest.TestCase):
   # [END type_hints_runtime_on]
 
   def test_deterministic_key(self):
-p = beam.Pipeline('DirectPipelineRunner', argv=sys.argv)
 lines = ['banana,fruit,3', 'kiwi,fruit,2', 'kiwi,fruit,2', 

[2/2] incubator-beam git commit: Closes #593

2016-07-07 Thread dhalperi
Closes #593


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6b06e3e2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6b06e3e2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6b06e3e2

Branch: refs/heads/python-sdk
Commit: 6b06e3e220cb25b4aa4fe4ab3647e19d0c257f2a
Parents: 2534976 2b14fe5
Author: Dan Halperin 
Authored: Thu Jul 7 09:49:43 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 09:49:43 2016 -0700

--
 sdks/python/.pylintrc  | 3 ---
 sdks/python/apache_beam/examples/snippets/snippets.py  | 1 +
 sdks/python/apache_beam/examples/snippets/snippets_test.py | 7 ---
 sdks/python/apache_beam/transforms/combiners_test.py   | 9 -
 sdks/python/apache_beam/transforms/trigger.py  | 3 ++-
 sdks/python/run_pylint.sh  | 2 +-
 sdks/python/setup.py   | 5 ++---
 7 files changed, 14 insertions(+), 16 deletions(-)
--




[GitHub] incubator-beam pull request #602: Fix BigtableIO display data label

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/602


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Fix BigtableIO display data label

2016-07-07 Thread dhalperi
Fix BigtableIO display data label


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9c91403c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9c91403c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9c91403c

Branch: refs/heads/master
Commit: 9c91403c7b0b7e45473b30be8019a8e07c66fd1c
Parents: 33b18b1
Author: Scott Wegner 
Authored: Thu Jul 7 08:38:14 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 09:32:23 2016 -0700

--
 .../transforms/display/DisplayDataMatchers.java | 22 
 .../sdk/transforms/display/DisplayDataTest.java | 13 ++--
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java|  2 +-
 .../sdk/io/gcp/bigtable/BigtableIOTest.java | 10 -
 4 files changed, 34 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9c91403c/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java
index 23cffd4..025a1f7 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java
@@ -360,4 +360,26 @@ public class DisplayDataMatchers {
   }
 };
   }
+
+  /**
+   * Creates a matcher that matches if the examined {@link DisplayData.Item} 
has the specified
+   * label.
+   */
+  public static Matcher hasLabel(String label) {
+return hasLabel(Matchers.is(label));
+  }
+
+  /**
+   * Creates a matcher that matches if the examined {@link DisplayData.Item} 
has a label matching
+   * the specified label matcher.
+   */
+  public static Matcher hasLabel(Matcher 
labelMatcher) {
+return new FeatureMatcher(
+labelMatcher, "display item with label", "label") {
+  @Override
+  protected String featureValueOf(DisplayData.Item actual) {
+return actual.getLabel();
+  }
+};
+  }
 }

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9c91403c/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java
index a1189bb..cafe873 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java
@@ -19,6 +19,7 @@ package org.apache.beam.sdk.transforms.display;
 
 import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasDisplayItem;
 import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasKey;
+import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasLabel;
 import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasNamespace;
 import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasType;
 import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasValue;
@@ -206,7 +207,7 @@ public class DisplayDataTest implements Serializable {
 hasType(DisplayData.Type.TIMESTAMP),
 hasValue(ISO_FORMATTER.print(value)),
 hasShortValue(nullValue(String.class)),
-hasLabel(is("the current instant")),
+hasLabel("the current instant"),
 hasUrl(is("http://time.gov;)));
 
 assertThat(item, matchesAllOf);
@@ -1104,16 +1105,6 @@ public class DisplayDataTest implements Serializable {
 return hasItem(jsonNode);
   }
 
-  private static Matcher hasLabel(Matcher 
labelMatcher) {
-return new FeatureMatcher(
-labelMatcher, "display item with label", "label") {
-  @Override
-  protected String featureValueOf(DisplayData.Item actual) {
-return actual.getLabel();
-  }
-};
-  }
-
   private static Matcher hasUrl(Matcher 
urlMatcher) {
 return new FeatureMatcher(
 urlMatcher, "display item with url", "URL") {

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9c91403c/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java

[1/2] incubator-beam git commit: Closes #602

2016-07-07 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 33b18b173 -> 1963bde44


Closes #602


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1963bde4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1963bde4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1963bde4

Branch: refs/heads/master
Commit: 1963bde446a798f0971393bb8f0076fc933367cf
Parents: 33b18b1 9c91403
Author: Dan Halperin 
Authored: Thu Jul 7 09:32:23 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 09:32:23 2016 -0700

--
 .../transforms/display/DisplayDataMatchers.java | 22 
 .../sdk/transforms/display/DisplayDataTest.java | 13 ++--
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java|  2 +-
 .../sdk/io/gcp/bigtable/BigtableIOTest.java | 10 -
 4 files changed, 34 insertions(+), 13 deletions(-)
--




[jira] [Commented] (BEAM-403) Support staging SDK packages from PyPI for remote execution

2016-07-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366324#comment-15366324
 ] 

ASF GitHub Bot commented on BEAM-403:
-

Github user silviulica closed the pull request at:

https://github.com/apache/incubator-beam/pull/569


> Support staging SDK packages from PyPI for remote execution
> ---
>
> Key: BEAM-403
> URL: https://issues.apache.org/jira/browse/BEAM-403
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py
>Reporter: Silviu Calinoiu
>Assignee: Silviu Calinoiu
>Priority: Minor
> Fix For: 0.2.0-incubating
>
>
> Currently the dataflow runner will pickup the SDK tarball from the old github 
> repo and stage it. We need to pick it up from PyPI (where packages will be 
> released) and stage it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #569: [BEAM-403] Get current SDK package from Py...

2016-07-07 Thread silviulica
Github user silviulica closed the pull request at:

https://github.com/apache/incubator-beam/pull/569


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #602: Fix BigtableIO display data label

2016-07-07 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/602

Fix BigtableIO display data label

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam bigtable-displaydata

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/602.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #602


commit 064e4c00ad410b041098a918ceb3983d668f1c7a
Author: Scott Wegner 
Date:   2016-07-07T15:38:14Z

Fix BigtableIO display data label




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #601: Documentation URL provided previously thro...

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/601


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Closes #601

2016-07-07 Thread dhalperi
Closes #601


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/33b18b17
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/33b18b17
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/33b18b17

Branch: refs/heads/master
Commit: 33b18b173c8132435f6051827cdc1355693205ff
Parents: 66d726a b90f71a
Author: Dan Halperin 
Authored: Thu Jul 7 08:35:17 2016 -0700
Committer: Dan Halperin 
Committed: Thu Jul 7 08:35:17 2016 -0700

--
 README.md | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--




[jira] [Resolved] (BEAM-408) ProxyInvocationHandler uses inefficient Math.random() for random int

2016-07-07 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-408.

   Resolution: Fixed
Fix Version/s: 0.2.0-incubating

> ProxyInvocationHandler uses inefficient Math.random() for random int
> 
>
> Key: BEAM-408
> URL: https://issues.apache.org/jira/browse/BEAM-408
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Lucas Amorim
>Priority: Minor
>  Labels: findbugs, newbie, starter
> Fix For: 0.2.0-incubating
>
>
> [FindBugs 
> DM_NEXTINT_VIA_NEXTDOUBLE|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L209]:
>  Use the nextInt method of Random rather than nextDouble to generate a random 
> integer
> Applies to: 
> [ProxyInvocationHandler.hashCode|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java#L96].
>  This random value is used as a pseudo-unique value for #hashCode(), although 
> the default Object.hasCode() implementation does the same thing, so this 
> should be unnecessary.
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-408) ProxyInvocationHandler uses inefficient Math.random() for random int

2016-07-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366127#comment-15366127
 ] 

ASF GitHub Bot commented on BEAM-408:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/588


> ProxyInvocationHandler uses inefficient Math.random() for random int
> 
>
> Key: BEAM-408
> URL: https://issues.apache.org/jira/browse/BEAM-408
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Lucas Amorim
>Priority: Minor
>  Labels: findbugs, newbie, starter
>
> [FindBugs 
> DM_NEXTINT_VIA_NEXTDOUBLE|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L209]:
>  Use the nextInt method of Random rather than nextDouble to generate a random 
> integer
> Applies to: 
> [ProxyInvocationHandler.hashCode|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java#L96].
>  This random value is used as a pseudo-unique value for #hashCode(), although 
> the default Object.hasCode() implementation does the same thing, so this 
> should be unnecessary.
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/4] incubator-beam git commit: Removes extra blank space

2016-07-07 Thread lcwik
Repository: incubator-beam
Updated Branches:
  refs/heads/master 1cb898f58 -> 66d726aa2


Removes extra blank space


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7e97cbcc
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7e97cbcc
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7e97cbcc

Branch: refs/heads/master
Commit: 7e97cbcc8a6ec47bcfad17054e581a45dc248ac2
Parents: 9df033c
Author: Lucas Amorim 
Authored: Wed Jul 6 23:11:55 2016 -0700
Committer: Luke Cwik 
Committed: Thu Jul 7 09:07:15 2016 -0400

--
 .../org/apache/beam/sdk/options/ProxyInvocationHandler.java| 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/7e97cbcc/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
index cb69979..fe67f16 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
@@ -91,11 +91,9 @@ import javax.annotation.concurrent.ThreadSafe;
 class ProxyInvocationHandler implements InvocationHandler, HasDisplayData {
   private static final ObjectMapper MAPPER = new ObjectMapper();
   /**
-   * No two instances of this class are considered equivalent hence we 
generate a random hash code
-   * between 0 and {@link Integer#MAX_VALUE}.
+   * No two instances of this class are considered equivalent hence we 
generate a random hash code.
*/
-  private final int hashCode =  ThreadLocalRandom.current().nextInt(
-  Integer.MIN_VALUE, Integer.MAX_VALUE);
+  private final int hashCode = ThreadLocalRandom.current().nextInt();
   private final Set knownInterfaces;
   private final ClassToInstanceMap interfaceToProxyCache;
   private final Map options;



[GitHub] incubator-beam pull request #588: [BEAM-408] - Fixes ProxyInvocationHandler ...

2016-07-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/588


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[4/4] incubator-beam git commit: [BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for random int

2016-07-07 Thread lcwik
[BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for 
random int

This closes #588


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/66d726aa
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/66d726aa
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/66d726aa

Branch: refs/heads/master
Commit: 66d726aa2aa56f11f2ac91771941d84e64fb4e4c
Parents: 1cb898f 7e97cbc
Author: Luke Cwik 
Authored: Thu Jul 7 09:07:30 2016 -0400
Committer: Luke Cwik 
Committed: Thu Jul 7 09:07:30 2016 -0400

--
 .../build-tools/src/main/resources/beam/findbugs-filter.xml| 6 --
 .../org/apache/beam/sdk/options/ProxyInvocationHandler.java| 6 +++---
 2 files changed, 3 insertions(+), 9 deletions(-)
--




[3/4] incubator-beam git commit: [BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for random int

2016-07-07 Thread lcwik
[BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for 
random int

For more information: https://issues.apache.org/jira/browse/BEAM-408


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/708244bf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/708244bf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/708244bf

Branch: refs/heads/master
Commit: 708244bf91b4079690bbaed0c063b70e76a95794
Parents: 1cb898f
Author: Lucas Amorim 
Authored: Mon Jul 4 15:47:39 2016 -0700
Committer: Luke Cwik 
Committed: Thu Jul 7 09:07:15 2016 -0400

--
 .../build-tools/src/main/resources/beam/findbugs-filter.xml| 6 --
 .../org/apache/beam/sdk/options/ProxyInvocationHandler.java| 4 +++-
 2 files changed, 3 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/708244bf/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
--
diff --git a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml 
b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
index a1f0e8a..d151315 100644
--- a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
+++ b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
@@ -194,12 +194,6 @@
 
   
   
-
-
-
-
-  
-  
 
 
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/708244bf/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
index e3d763b..0c10c2f 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java
@@ -66,6 +66,7 @@ import java.util.HashSet;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
+import java.util.Random;
 import java.util.Set;
 import java.util.SortedMap;
 import java.util.TreeMap;
@@ -93,7 +94,8 @@ class ProxyInvocationHandler implements InvocationHandler, 
HasDisplayData {
* No two instances of this class are considered equivalent hence we 
generate a random hash code
* between 0 and {@link Integer#MAX_VALUE}.
*/
-  private final int hashCode = (int) (Math.random() * Integer.MAX_VALUE);
+  private final int hashCode = (RANDOM.nextInt() * Integer.MAX_VALUE);
+  private static final Random RANDOM = new Random();
   private final Set knownInterfaces;
   private final ClassToInstanceMap interfaceToProxyCache;
   private final Map options;



[GitHub] incubator-beam pull request #585: Merge master into runners-spark2 branch to...

2016-07-07 Thread amitsela
Github user amitsela closed the pull request at:

https://github.com/apache/incubator-beam/pull/585


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker

2016-07-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365666#comment-15365666
 ] 

ASF GitHub Bot commented on BEAM-381:
-

GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/600

[BEAM-381] BoundedReader: update the range last of all

Reorders the code in some splitAtFraction calls so that the rangeTracker 
update
is the last thing (besides assignment) in the function. This avoids a 
potential
issue if creating the primary or residual sources happens to throw an 
exception.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam offsetbasedreader

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/600.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #600


commit 489432a3ce31aa3c769834920841c7a1df298720
Author: Dan Halperin 
Date:   2016-07-07T06:05:56Z

[BEAM-381] BoundedReader: update the range last of all

Reorders the code in some splitAtFraction calls so that the rangeTracker 
update
is the last thing (besides assignment) in the function. This avoids a 
potential
issue if creating the primary or residual sources happens to throw an 
exception.




> OffsetBasedReader should construct sources before updating the range tracker
> 
>
> Key: BEAM-381
> URL: https://issues.apache.org/jira/browse/BEAM-381
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>
> OffsetBasedReader has the following code:
> {code}
>   if (!rangeTracker.trySplitAtPosition(splitOffset)) {
> return null;
>   }
>   long start = source.getStartOffset();
>   long end = source.getEndOffset();
>   OffsetBasedSource primary = source.createSourceForSubrange(start, 
> splitOffset);
>   OffsetBasedSource residual = 
> source.createSourceForSubrange(splitOffset, end);
>   this.source = primary;
>   return residual;
> {code}
> The first line is the line that updates the range of this source. However, 
> subsequent lines might throw (specifically, in 
> source.createSourceForSubrange). We should construct the sources first, and 
> then catch exceptions and return null if they fail. This way, the 
> splitAtFraction call will not throw (so work is not wasted) and the range 
> tracker will not be updated if either the primary or (more likely) the 
> residual could not be created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #600: [BEAM-381] BoundedReader: update the range...

2016-07-07 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/600

[BEAM-381] BoundedReader: update the range last of all

Reorders the code in some splitAtFraction calls so that the rangeTracker 
update
is the last thing (besides assignment) in the function. This avoids a 
potential
issue if creating the primary or residual sources happens to throw an 
exception.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam offsetbasedreader

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/600.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #600


commit 489432a3ce31aa3c769834920841c7a1df298720
Author: Dan Halperin 
Date:   2016-07-07T06:05:56Z

[BEAM-381] BoundedReader: update the range last of all

Reorders the code in some splitAtFraction calls so that the rangeTracker 
update
is the last thing (besides assignment) in the function. This avoids a 
potential
issue if creating the primary or residual sources happens to throw an 
exception.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #601: Documentation URL provided previously thro...

2016-07-07 Thread SrinikhilReddy
GitHub user SrinikhilReddy opened a pull request:

https://github.com/apache/incubator-beam/pull/601

Documentation URL provided previously throws Error



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SrinikhilReddy/incubator-beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/601.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #601


commit 773f0fd1696627a0991682042a765813a9000af9
Author: srinikhil 
Date:   2016-07-07T06:06:54Z

Change In  Documentation URL




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---