[jira] [Closed] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker
[ https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin closed BEAM-381. > OffsetBasedReader should construct sources before updating the range tracker > > > Key: BEAM-381 > URL: https://issues.apache.org/jira/browse/BEAM-381 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 0.1.0-incubating, 0.2.0-incubating >Reporter: Daniel Halperin >Assignee: Daniel Halperin > Fix For: 0.2.0-incubating > > > OffsetBasedReader has the following code: > {code} > if (!rangeTracker.trySplitAtPosition(splitOffset)) { > return null; > } > long start = source.getStartOffset(); > long end = source.getEndOffset(); > OffsetBasedSource primary = source.createSourceForSubrange(start, > splitOffset); > OffsetBasedSource residual = > source.createSourceForSubrange(splitOffset, end); > this.source = primary; > return residual; > {code} > The first line is the line that updates the range of this source. However, > subsequent lines might throw (specifically, in > source.createSourceForSubrange). We should construct the sources first, and > then catch exceptions and return null if they fail. This way, the > splitAtFraction call will not throw (so work is not wasted) and the range > tracker will not be updated if either the primary or (more likely) the > residual could not be created. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker
[ https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367239#comment-15367239 ] ASF GitHub Bot commented on BEAM-381: - Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/600 > OffsetBasedReader should construct sources before updating the range tracker > > > Key: BEAM-381 > URL: https://issues.apache.org/jira/browse/BEAM-381 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 0.1.0-incubating, 0.2.0-incubating >Reporter: Daniel Halperin >Assignee: Daniel Halperin > Fix For: 0.2.0-incubating > > > OffsetBasedReader has the following code: > {code} > if (!rangeTracker.trySplitAtPosition(splitOffset)) { > return null; > } > long start = source.getStartOffset(); > long end = source.getEndOffset(); > OffsetBasedSource primary = source.createSourceForSubrange(start, > splitOffset); > OffsetBasedSource residual = > source.createSourceForSubrange(splitOffset, end); > this.source = primary; > return residual; > {code} > The first line is the line that updates the range of this source. However, > subsequent lines might throw (specifically, in > source.createSourceForSubrange). We should construct the sources first, and > then catch exceptions and return null if they fail. This way, the > splitAtFraction call will not throw (so work is not wasted) and the range > tracker will not be updated if either the primary or (more likely) the > residual could not be created. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker
[ https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-381. -- Resolution: Fixed > OffsetBasedReader should construct sources before updating the range tracker > > > Key: BEAM-381 > URL: https://issues.apache.org/jira/browse/BEAM-381 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 0.1.0-incubating, 0.2.0-incubating >Reporter: Daniel Halperin >Assignee: Daniel Halperin > Fix For: 0.2.0-incubating > > > OffsetBasedReader has the following code: > {code} > if (!rangeTracker.trySplitAtPosition(splitOffset)) { > return null; > } > long start = source.getStartOffset(); > long end = source.getEndOffset(); > OffsetBasedSource primary = source.createSourceForSubrange(start, > splitOffset); > OffsetBasedSource residual = > source.createSourceForSubrange(splitOffset, end); > this.source = primary; > return residual; > {code} > The first line is the line that updates the range of this source. However, > subsequent lines might throw (specifically, in > source.createSourceForSubrange). We should construct the sources first, and > then catch exceptions and return null if they fail. This way, the > splitAtFraction call will not throw (so work is not wasted) and the range > tracker will not be updated if either the primary or (more likely) the > residual could not be created. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] incubator-beam git commit: Closes #600
Repository: incubator-beam Updated Branches: refs/heads/master 921c55c94 -> 74e1f83df Closes #600 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/74e1f83d Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/74e1f83d Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/74e1f83d Branch: refs/heads/master Commit: 74e1f83df651af61a98c388604d6cdd4f75d0ff5 Parents: 921c55c 543842c Author: Dan HalperinAuthored: Thu Jul 7 22:20:03 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 22:20:03 2016 -0700 -- .../main/java/org/apache/beam/sdk/io/OffsetBasedSource.java| 6 +++--- .../java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) --
[jira] [Updated] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker
[ https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin updated BEAM-381: - Fix Version/s: 0.2.0-incubating > OffsetBasedReader should construct sources before updating the range tracker > > > Key: BEAM-381 > URL: https://issues.apache.org/jira/browse/BEAM-381 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 0.1.0-incubating, 0.2.0-incubating >Reporter: Daniel Halperin >Assignee: Daniel Halperin > Fix For: 0.2.0-incubating > > > OffsetBasedReader has the following code: > {code} > if (!rangeTracker.trySplitAtPosition(splitOffset)) { > return null; > } > long start = source.getStartOffset(); > long end = source.getEndOffset(); > OffsetBasedSource primary = source.createSourceForSubrange(start, > splitOffset); > OffsetBasedSource residual = > source.createSourceForSubrange(splitOffset, end); > this.source = primary; > return residual; > {code} > The first line is the line that updates the range of this source. However, > subsequent lines might throw (specifically, in > source.createSourceForSubrange). We should construct the sources first, and > then catch exceptions and return null if they fail. This way, the > splitAtFraction call will not throw (so work is not wasted) and the range > tracker will not be updated if either the primary or (more likely) the > residual could not be created. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #600: [BEAM-381] BoundedReader: update the range...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/600 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: [BEAM-381] BoundedReader: update the range last of all
[BEAM-381] BoundedReader: update the range last of all Reorders the code in some splitAtFraction calls so that the rangeTracker update is the last thing (besides assignment) in the function. This avoids a potential issue if creating the primary or residual sources happens to throw an exception. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/543842cb Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/543842cb Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/543842cb Branch: refs/heads/master Commit: 543842cbd9d433bcef9b7962d9c71a8779e99eb5 Parents: 921c55c Author: Dan HalperinAuthored: Wed Jul 6 23:05:56 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 22:20:03 2016 -0700 -- .../main/java/org/apache/beam/sdk/io/OffsetBasedSource.java| 6 +++--- .../java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/543842cb/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java index d5a6801..8cbcd1f 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/OffsetBasedSource.java @@ -370,13 +370,13 @@ public abstract class OffsetBasedSource extends BoundedSource { LOG.debug( "Proposing to split OffsetBasedReader {} at fraction {} (offset {})", rangeTracker, fraction, splitOffset); - if (!rangeTracker.trySplitAtPosition(splitOffset)) { -return null; - } long start = source.getStartOffset(); long end = source.getEndOffset(); OffsetBasedSource primary = source.createSourceForSubrange(start, splitOffset); OffsetBasedSource residual = source.createSourceForSubrange(splitOffset, end); + if (!rangeTracker.trySplitAtPosition(splitOffset)) { +return null; + } this.source = primary; return residual; } http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/543842cb/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java -- diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java index b4c3c75..0c485bf 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java @@ -861,11 +861,11 @@ public class BigtableIO { } logger.debug( "Proposing to split {} at fraction {} (key {})", rangeTracker, fraction, splitKey); + BigtableSource primary = source.withEndKey(splitKey); + BigtableSource residual = source.withStartKey(splitKey); if (!rangeTracker.trySplitAtPosition(splitKey)) { return null; } - BigtableSource primary = source.withEndKey(splitKey); - BigtableSource residual = source.withStartKey(splitKey); this.source = primary; return residual; }
[2/2] incubator-beam git commit: Closes #608
Closes #608 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e167d2b5 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e167d2b5 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e167d2b5 Branch: refs/heads/python-sdk Commit: e167d2b5a9146f989ec069a53545e933029d7c1a Parents: a580b31 1da908f Author: Dan HalperinAuthored: Thu Jul 7 22:04:21 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 22:04:21 2016 -0700 -- sdks/python/run_postcommit.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --
[GitHub] incubator-beam pull request #608: Uncomment tox in the postcommit script.
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/608 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Uncomment tox in the postcommit script.
Repository: incubator-beam Updated Branches: refs/heads/python-sdk a580b31ac -> e167d2b5a Uncomment tox in the postcommit script. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1da908f7 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1da908f7 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1da908f7 Branch: refs/heads/python-sdk Commit: 1da908f749dfd8bc66d8c0bda4b0302cc0f343fa Parents: a580b31 Author: Ahmet AltayAuthored: Thu Jul 7 16:38:52 2016 -0700 Committer: Ahmet Altay Committed: Thu Jul 7 16:38:52 2016 -0700 -- sdks/python/run_postcommit.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1da908f7/sdks/python/run_postcommit.sh -- diff --git a/sdks/python/run_postcommit.sh b/sdks/python/run_postcommit.sh index 7af4e6c..23dd516 100755 --- a/sdks/python/run_postcommit.sh +++ b/sdks/python/run_postcommit.sh @@ -38,7 +38,7 @@ pip install virtualenv --user pip install tox --user # Tox runs unit tests in a virtual environment -# ${LOCAL_PATH}/tox -e py27 -c sdks/python/tox.ini +${LOCAL_PATH}/tox -e py27 -c sdks/python/tox.ini # Virtualenv for the rest of the script to run setup & e2e tests ${LOCAL_PATH}/virtualenv sdks/python
[GitHub] incubator-beam pull request #603: Modified addBulkOptions for simplicity
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/603 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Modified addBulkOptions for simplicity
Modified addBulkOptions for simplicity Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b9231826 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b9231826 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b9231826 Branch: refs/heads/master Commit: b9231826bb9c8084f3802206fbdd1d9f69fea3a6 Parents: 155409b Author: Ian ZhouAuthored: Thu Jul 7 10:27:47 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 21:53:35 2016 -0700 -- .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 95 +-- .../sdk/io/gcp/bigtable/BigtableIOTest.java | 99 ++-- 2 files changed, 99 insertions(+), 95 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/b9231826/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java -- diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java index dd17abe..b4c3c75 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java @@ -47,7 +47,6 @@ import com.google.bigtable.v1.Row; import com.google.bigtable.v1.RowFilter; import com.google.bigtable.v1.SampleRowKeysResponse; import com.google.cloud.bigtable.config.BigtableOptions; -import com.google.cloud.bigtable.config.BulkOptions; import com.google.cloud.bigtable.config.RetryOptions; import com.google.common.base.MoreObjects; import com.google.common.collect.ImmutableList; @@ -207,10 +206,23 @@ public class BigtableIO { public Read withBigtableOptions(BigtableOptions.Builder optionsBuilder) { checkNotNull(optionsBuilder, "optionsBuilder"); // TODO: is there a better way to clone a Builder? Want it to be immune from user changes. - BigtableOptions.Builder clonedBuilder = optionsBuilder.build().toBuilder(); - clonedBuilder.setDataChannelCount(1); - clonedBuilder = addRetryOptions(clonedBuilder); + BigtableOptions options = optionsBuilder.build(); + RetryOptions retryOptions = options.getRetryOptions(); + + // Set data channel count to one because there is only 1 scanner in this session + // Use retryOptionsToBuilder because absent in Bigtable library + // TODO: replace with RetryOptions.toBuilder() when added to Bigtable library + // Set batch size because of bug (incorrect initialization) in Bigtable library + // TODO: remove setRetryOptions when fixed in Bigtable library + BigtableOptions.Builder clonedBuilder = options.toBuilder() + .setDataChannelCount(1) + .setRetryOptions( + retryOptionsToBuilder(retryOptions) + .setStreamingBatchSize(Math.min(retryOptions.getStreamingBatchSize(), + retryOptions.getStreamingBufferSize() / 2)) + .build()); BigtableOptions optionsWithAgent = clonedBuilder.setUserAgent(getUserAgent()).build(); + return new Read(optionsWithAgent, tableId, filter, bigtableService); } @@ -393,9 +405,24 @@ public class BigtableIO { public Write withBigtableOptions(BigtableOptions.Builder optionsBuilder) { checkNotNull(optionsBuilder, "optionsBuilder"); // TODO: is there a better way to clone a Builder? Want it to be immune from user changes. - BigtableOptions.Builder clonedBuilder = optionsBuilder.build().toBuilder(); - clonedBuilder = addBulkOptions(clonedBuilder); - clonedBuilder = addRetryOptions(clonedBuilder); + BigtableOptions options = optionsBuilder.build(); + RetryOptions retryOptions = options.getRetryOptions(); + + // Set useBulkApi to true for enabling bulk writes + // Use retryOptionsToBuilder because absent in Bigtable library + // TODO: replace with RetryOptions.toBuilder() when added to Bigtable library + // Set batch size because of bug (incorrect initialization) in Bigtable library + // TODO: remove setRetryOptions when fixed in Bigtable library + BigtableOptions.Builder clonedBuilder = options.toBuilder() + .setBulkOptions( + options.getBulkOptions().toBuilder() + .setUseBulkApi(true) + .build()) + .setRetryOptions( + retryOptionsToBuilder(retryOptions) + .setStreamingBatchSize(Math.min(retryOptions.getStreamingBatchSize(), +
[1/2] incubator-beam git commit: Closes #603
Repository: incubator-beam Updated Branches: refs/heads/master 155409bf6 -> 290c0b772 Closes #603 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/290c0b77 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/290c0b77 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/290c0b77 Branch: refs/heads/master Commit: 290c0b77280b5fab3b656b3009bfdc897784c6b5 Parents: 155409b b923182 Author: Dan HalperinAuthored: Thu Jul 7 21:53:35 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 21:53:35 2016 -0700 -- .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 95 +-- .../sdk/io/gcp/bigtable/BigtableIOTest.java | 99 ++-- 2 files changed, 99 insertions(+), 95 deletions(-) --
[jira] [Resolved] (BEAM-289) Examples Use TypeDescriptors
[ https://issues.apache.org/jira/browse/BEAM-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles resolved BEAM-289. -- Resolution: Fixed Fix Version/s: 0.1.0-incubating > Examples Use TypeDescriptors > > > Key: BEAM-289 > URL: https://issues.apache.org/jira/browse/BEAM-289 > Project: Beam > Issue Type: Improvement > Components: examples-java >Reporter: Jesse Anderson >Assignee: Frances Perry > Fix For: 0.1.0-incubating > > > Change the Java and Java 8 examples to use TypeDescriptors instead of inline > TypeDescriptor creation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-289) Examples Use TypeDescriptors
[ https://issues.apache.org/jira/browse/BEAM-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-289: - Fix Version/s: (was: 0.1.0-incubating) 0.2.0-incubating > Examples Use TypeDescriptors > > > Key: BEAM-289 > URL: https://issues.apache.org/jira/browse/BEAM-289 > Project: Beam > Issue Type: Improvement > Components: examples-java >Reporter: Jesse Anderson >Assignee: Frances Perry > Fix For: 0.2.0-incubating > > > Change the Java and Java 8 examples to use TypeDescriptors instead of inline > TypeDescriptor creation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-431) Examples dependencies on runners are a bit much and not enough
Kenneth Knowles created BEAM-431: Summary: Examples dependencies on runners are a bit much and not enough Key: BEAM-431 URL: https://issues.apache.org/jira/browse/BEAM-431 Project: Beam Issue Type: Bug Components: examples-java Reporter: Kenneth Knowles Assignee: Kenneth Knowles The Java 7 examples directly depend on the Dataflow runner as a compile dependency. This should just be fixed and removed. The Java 8 examples have optional runtime dependencies on the Dataflow and Flink runners. But even optional runtime dependencies must be resolved in a test scope, so it is not possible to exclude these from a hermetic testing environment - quite annoying. And the Spark runner should be included as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-429) minor: remove an obsolete comment in KafakIOTest.java
[ https://issues.apache.org/jira/browse/BEAM-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366949#comment-15366949 ] ASF GitHub Bot commented on BEAM-429: - Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/606 > minor: remove an obsolete comment in KafakIOTest.java > - > > Key: BEAM-429 > URL: https://issues.apache.org/jira/browse/BEAM-429 > Project: Beam > Issue Type: Improvement >Reporter: Raghu Angadi >Assignee: Raghu Angadi >Priority: Minor > > see https://github.com/apache/incubator-beam/pull/606 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] incubator-beam git commit: Closes #606
Closes #606 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/155409bf Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/155409bf Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/155409bf Branch: refs/heads/master Commit: 155409bf6041380c309e4452719763a683d41936 Parents: 1963bde 1c9d16d Author: Dan HalperinAuthored: Thu Jul 7 16:13:36 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 16:13:36 2016 -0700 -- .../src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java | 1 - 1 file changed, 1 deletion(-) --
[GitHub] incubator-beam pull request #606: [BEAM-429] remove an obsolete comment in K...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/606 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: remove an obsolete comment in KafkaIOTest.java
Repository: incubator-beam Updated Branches: refs/heads/master 1963bde44 -> 155409bf6 remove an obsolete comment in KafkaIOTest.java Kafka 10.0 added compatible constructor for ConsumerRecord. It was not present when I wrote this test. https://github.com/apache/kafka/commit/4e557f8ef60d46a8870704655c9a35092f74d125 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1c9d16d7 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1c9d16d7 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1c9d16d7 Branch: refs/heads/master Commit: 1c9d16d70ad2bd55b0736ea37f9357029f39ade5 Parents: 1963bde Author: Raghu AngadiAuthored: Thu Jul 7 14:22:41 2016 -0700 Committer: GitHub Committed: Thu Jul 7 14:22:41 2016 -0700 -- .../src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java | 1 - 1 file changed, 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1c9d16d7/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java -- diff --git a/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java b/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java index 587e3e2..dd93823 100644 --- a/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java +++ b/sdks/java/io/kafka/src/test/java/org/apache/beam/sdk/io/kafka/KafkaIOTest.java @@ -126,7 +126,6 @@ public class KafkaIOTest { records.put(tp, new ArrayList >()); } records.get(tp).add( - // Note: this interface has changed in 0.10. may get fixed before the release. new ConsumerRecord ( tp.topic(), tp.partition(),
[GitHub] incubator-beam pull request #607: Mark primitive display data tests Runnable...
GitHub user swegner opened a pull request: https://github.com/apache/incubator-beam/pull/607 Mark primitive display data tests RunnableOnService Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/swegner/incubator-beam displaydata-primitives Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/607.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #607 commit 0d30860483a2c8f3d3efde9a45a162e5b8ece476 Author: Scott WegnerDate: 2016-07-07T20:49:30Z Mark primitive display data tests RunnableOnService --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (BEAM-391) Exceptions in gcsio upload thread causes pipeline to stall
[ https://issues.apache.org/jira/browse/BEAM-391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay updated BEAM-391: - Summary: Exceptions in gcsio upload thread causes pipeline to stall (was: Invalid GCS bucket name causes pipeline to stall) > Exceptions in gcsio upload thread causes pipeline to stall > -- > > Key: BEAM-391 > URL: https://issues.apache.org/jira/browse/BEAM-391 > Project: Beam > Issue Type: Bug > Components: sdk-py >Reporter: Ahmet Altay > > gcsio got stuck with invalid bucket name > GcsBufferedWriter._start_upload (gcsio.py) raises an exception if the bucket > does not exist. This causes upload thread to silenty fail. It logs exception > to the log but this does not stop the pipeline or closes the receiving end of > the multiprocessing.Pipe(). Later a call in to write() blocks at > self.conn.send_bytes(). Note that send may block if the buffer is full. > Upload thread should have a finally clause to close the socket connection. Or > better propagating the exception to its parent. This is true for other types > of exceptions also. > Another small issue in the GcsBufferedWriter.close(). It does not self > self.close to True. > reproduction: python -m apache_beam.examples.wordcount --output > gs://no-such-thing/ > Prints the exception but goes on forever. Ctrl + C breaks the main thread > shows where it got stuck. > Similarly reproducible on the service. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-430) Introducing gcpTempLocation that default to tempLocation
[ https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366882#comment-15366882 ] Pei He commented on BEAM-430: - Since stagingLocation will default to gcpTempLocation, and gcpTempLocation will default to tempLocation, DataflowRunner cannot use stagingLocation as the default value for tempLocation. We will break the dependency cycle between stagingLocation and tempLocation, which is currently in DataflowRunner. > Introducing gcpTempLocation that default to tempLocation > > > Key: BEAM-430 > URL: https://issues.apache.org/jira/browse/BEAM-430 > Project: Beam > Issue Type: Improvement >Reporter: Pei He >Assignee: Pei He >Priority: Minor > > Currently, DataflowPipelineOptions.stagingLocation default to tempLocation. > And, it requires tempLocation to be a gcs path. > Another case is BigQueryIO uses tempLocation and also requires it to be on > gcs. > So, users cannot set tempLocation to a non-gcs path with DataflowRunner or > BigQueryIO. > However, tempLocation could be on any file system. For example, WordCount > defaults to output to tempLocation. > The proposal is to add gcpTempLocation. And, it defaults to tempLocation if > tempLocation is a gcs path. > StagingLocation and BigQueryIO will use gcpTempLocation by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-429) minor: remove an obsolete comment in KafakIOTest.java
Raghu Angadi created BEAM-429: - Summary: minor: remove an obsolete comment in KafakIOTest.java Key: BEAM-429 URL: https://issues.apache.org/jira/browse/BEAM-429 Project: Beam Issue Type: Improvement Reporter: Raghu Angadi Assignee: Raghu Angadi Priority: Minor see https://github.com/apache/incubator-beam/pull/606 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #606: remove an obsolete comment in KafkaIOTest....
GitHub user rangadi opened a pull request: https://github.com/apache/incubator-beam/pull/606 remove an obsolete comment in KafkaIOTest.java Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- Kafka 10.0 added compatible constructor for ConsumerRecord. It was not present when I wrote this test. https://github.com/apache/kafka/commit/4e557f8ef60d46a8870704655c9a35092f74d125 You can merge this pull request into a Git repository by running: $ git pull https://github.com/rangadi/incubator-beam patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/606.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #606 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #604: Add support for ZLIB and DEFLATE compressi...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/604 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Add support for ZLIB and DEFLATE compression
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 9bc04b750 -> a580b31ac Add support for ZLIB and DEFLATE compression Code originally contributed by Slaven Bilac. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e1b3ac30 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e1b3ac30 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e1b3ac30 Branch: refs/heads/python-sdk Commit: e1b3ac30b5dd5b14058247fad73b6b235e618094 Parents: 9bc04b7 Author: Robert BradshawAuthored: Thu Jul 7 13:40:18 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 13:40:18 2016 -0700 -- sdks/python/apache_beam/io/fileio.py | 168 + sdks/python/apache_beam/io/fileio_test.py | 30 - 2 files changed, 173 insertions(+), 25 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e1b3ac30/sdks/python/apache_beam/io/fileio.py -- diff --git a/sdks/python/apache_beam/io/fileio.py b/sdks/python/apache_beam/io/fileio.py index 6475a34..31b6a93 100644 --- a/sdks/python/apache_beam/io/fileio.py +++ b/sdks/python/apache_beam/io/fileio.py @@ -20,7 +20,6 @@ from __future__ import absolute_import import glob -import gzip import logging from multiprocessing.pool import ThreadPool import os @@ -28,6 +27,7 @@ import re import shutil import tempfile import time +import zlib from apache_beam import coders from apache_beam.io import iobase @@ -269,13 +269,129 @@ class _CompressionType(object): class CompressionTypes(object): """Enum-like class representing known compression types.""" NO_COMPRESSION = _CompressionType(1) # No compression. - DEFLATE = _CompressionType(2) # 'Deflate' ie gzip compression. + DEFLATE = _CompressionType(2) # 'Deflate' compression (without headers). + GZIP = _CompressionType(3) # gzip compression (deflate with gzip headers). + ZLIB = _CompressionType(4) # zlib compression (deflate with zlib headers). @staticmethod - def valid_compression_type(compression_type): + def is_valid_compression_type(compression_type): """Returns true for valid compression types, false otherwise.""" return isinstance(compression_type, _CompressionType) + @staticmethod + def mime_type(compression_type, default='application/octet-stream'): +if compression_type == CompressionTypes.GZIP: + return 'application/x-gzip' +elif compression_type == CompressionTypes.ZLIB: + return 'application/octet-stream' +elif compression_type == CompressionTypes.DEFLATE: + return 'application/octet-stream' +else: + return default + + +class _CompressedFile(object): + """Somewhat limited file wrapper for easier handling of compressed files.""" + _type_mask = { + CompressionTypes.ZLIB: zlib.MAX_WBITS, + CompressionTypes.GZIP: zlib.MAX_WBITS | 16, + CompressionTypes.DEFLATE: -zlib.MAX_WBITS, + } + + def __init__(self, + fileobj=None, + compression_type=CompressionTypes.ZLIB, + read_size=16384): +self._validate_compression_type(compression_type) +if not fileobj: + raise ValueError('fileobj must be opened file but was %s' % fileobj) + +self.fileobj = fileobj +self.data = '' +self.read_size = read_size +self.compression_type = compression_type +if self._readable(): + self.decompressor = self._create_decompressor(self.compression_type) +else: + self.decompressor = None +if self._writeable(): + self.compressor = self._create_compressor(self.compression_type) +else: + self.compressor = None + + def _validate_compression_type(self, compression_type): +if not CompressionTypes.is_valid_compression_type(compression_type): + raise TypeError('compression_type must be CompressionType object but ' + 'was %s' % type(compression_type)) +if compression_type == CompressionTypes.NO_COMPRESSION: + raise ValueError('cannot create object with no compression') + + def _create_compressor(self, compression_type): +self._validate_compression_type(compression_type) +return zlib.compressobj(9, zlib.DEFLATED, +self._type_mask[compression_type]) + + def _create_decompressor(self, compression_type): +self._validate_compression_type(compression_type) +return zlib.decompressobj(self._type_mask[compression_type]) + + def _readable(self): +mode = self.fileobj.mode +return 'r' in mode or 'a' in mode + + def _writeable(self): +mode = self.fileobj.mode +return 'w' in mode or 'a' in mode + + def write(self, data): +"""Write data to file.""" +if not
[2/2] incubator-beam git commit: Closes #604
Closes #604 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a580b31a Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a580b31a Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a580b31a Branch: refs/heads/python-sdk Commit: a580b31ac5480318e632c3d8e33ea0bcfe5177d3 Parents: 9bc04b7 e1b3ac3 Author: Robert BradshawAuthored: Thu Jul 7 14:22:22 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 14:22:22 2016 -0700 -- sdks/python/apache_beam/io/fileio.py | 168 + sdks/python/apache_beam/io/fileio_test.py | 30 - 2 files changed, 173 insertions(+), 25 deletions(-) --
[jira] [Commented] (BEAM-320) Provide Beam keyturn binary distributions embedding runners and execution runtime
[ https://issues.apache.org/jira/browse/BEAM-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366779#comment-15366779 ] Vlad Rozov commented on BEAM-320: - Apache Beam can do binary release that includes necessary binaries for runners, and publish it to the Apache Beam wiki, svn repository along with the source release, maven repository or other similar resources. I suspect that runner providers such as Apache Flink, Apache Apex or Apache Spark will provide their binary distributions as well. Such binary distributions are different from the official Apache source release and if somebody uses Apache Beam source release to build binaries, she will need to download or build dependent run-time libraries. > Provide Beam keyturn binary distributions embedding runners and execution > runtime > - > > Key: BEAM-320 > URL: https://issues.apache.org/jira/browse/BEAM-320 > Project: Beam > Issue Type: Wish > Components: build-system >Reporter: Jean-Baptiste Onofré >Assignee: Jean-Baptiste Onofré > > Now, the only distribution Beam provides is the source distribution. > For new users, it could be interesting to have ready-to-use binary > distribution embedding the SDK, a specific runner with the backend execution > runtime. > For instance, we could provide: > - beam-spark-xxx.tar.gz containing SDK, Spark runner, Spark > - beam-flink-xxx.tar.gz containing SDK, Flink runner, Flink > Thoughts ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #605: Rename DataflowExampleUtils and DataflowEx...
GitHub user peihe opened a pull request: https://github.com/apache/incubator-beam/pull/605 Rename DataflowExampleUtils and DataflowExampleOptions You can merge this pull request into a Git repository by running: $ git pull https://github.com/peihe/incubator-beam renaming Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/605.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #605 commit e1af028eed98e3b06b926f947494d46df0d9df36 Author: Pei HeDate: 2016-07-07T20:45:24Z Rename DataflowExampleUtils and DataflowExampleOptions --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-320) Provide Beam keyturn binary distributions embedding runners and execution runtime
[ https://issues.apache.org/jira/browse/BEAM-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366734#comment-15366734 ] Amit Sela commented on BEAM-320: What about execution runtime source packaging ? On one hand it would be easier for the user to have the runner dependency "bring-along" the execution runtime as well (like the flink-runner does), on the other hand I'm not an Apache Spark PMC or even a Committer, and I'm not sure if doing such a thing would be OK. Sounds like the people who own this code have to make this call. BTW [~jbonofre] how does it usually work ? How does it work with Hadoop across all projects using it ? [~vrozov] mentioned that it's against Apache policy to include compiled binaries in the source release, how does this affect the choices we have ? Adding a link to the conversation thread: http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201607.mbox/%3ccabqdq5b8jf20d97e1xvxf+lpzc7sk3irabq33ruposwsuxg...@mail.gmail.com%3E > Provide Beam keyturn binary distributions embedding runners and execution > runtime > - > > Key: BEAM-320 > URL: https://issues.apache.org/jira/browse/BEAM-320 > Project: Beam > Issue Type: Wish > Components: build-system >Reporter: Jean-Baptiste Onofré >Assignee: Jean-Baptiste Onofré > > Now, the only distribution Beam provides is the source distribution. > For new users, it could be interesting to have ready-to-use binary > distribution embedding the SDK, a specific runner with the backend execution > runtime. > For instance, we could provide: > - beam-spark-xxx.tar.gz containing SDK, Spark runner, Spark > - beam-flink-xxx.tar.gz containing SDK, Flink runner, Flink > Thoughts ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #604: Add support for ZLIB and DEFLATE compressi...
GitHub user robertwb opened a pull request: https://github.com/apache/incubator-beam/pull/604 Add support for ZLIB and DEFLATE compression Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- Code originally contributed by Slaven Bilac. You can merge this pull request into a Git repository by running: $ git pull https://github.com/robertwb/incubator-beam compress Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/604.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #604 commit e1b3ac30b5dd5b14058247fad73b6b235e618094 Author: Robert BradshawDate: 2016-07-07T20:40:18Z Add support for ZLIB and DEFLATE compression Code originally contributed by Slaven Bilac. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests
[ https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366643#comment-15366643 ] Jason Kuster commented on BEAM-124: --- Update: Batch WordCount E2E has been in for a while. Verifiers are coming in a pull request soon, and then we'll look at expanding to a streaming test as well. > Testing -- End to End WordCount Batch and Streaming Tests > - > > Key: BEAM-124 > URL: https://issues.apache.org/jira/browse/BEAM-124 > Project: Beam > Issue Type: New Feature > Components: testing >Reporter: Steve Wheeler >Assignee: Jason Kuster > > Set up testing infrastructure so that an end to end test for WordCount (both > batch and streaming) will be run periodically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests
[ https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366643#comment-15366643 ] Jason Kuster edited comment on BEAM-124 at 7/7/16 7:28 PM: --- Update: Batch WordCount E2E has been in for a while. Verifiers are coming in a pull request soon, and then we'll look at expanding to a streaming test as well. Work is in-progress to get them running on all runners. was (Author: jasonkuster): Update: Batch WordCount E2E has been in for a while. Verifiers are coming in a pull request soon, and then we'll look at expanding to a streaming test as well. > Testing -- End to End WordCount Batch and Streaming Tests > - > > Key: BEAM-124 > URL: https://issues.apache.org/jira/browse/BEAM-124 > Project: Beam > Issue Type: New Feature > Components: testing >Reporter: Steve Wheeler >Assignee: Jason Kuster > > Set up testing infrastructure so that an end to end test for WordCount (both > batch and streaming) will be run periodically. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (BEAM-228) Create a merge bot for Beam
[ https://issues.apache.org/jira/browse/BEAM-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Kuster closed BEAM-228. - Resolution: Won't Fix Fix Version/s: Not applicable Mergebot is being delivered to apache infra instead - closing as obsolete. > Create a merge bot for Beam > --- > > Key: BEAM-228 > URL: https://issues.apache.org/jira/browse/BEAM-228 > Project: Beam > Issue Type: New Feature > Components: project-management >Reporter: Jason Kuster >Assignee: Jason Kuster > Fix For: Not applicable > > > This issue tracks the creation of a merge bot for Beam. This merge bot should > watch the Beam github repository and queue and merge pull requests which are > marked LGTM and good for merge by an approved Beam committer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/4] incubator-beam git commit: Better error message for poor use of callable apply
Better error message for poor use of callable apply Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c34f332a Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c34f332a Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c34f332a Branch: refs/heads/python-sdk Commit: c34f332a3ae6e4ad914965732f6a038a883a5b3b Parents: 31b3f00 Author: Robert BradshawAuthored: Wed Jul 6 15:42:55 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 11:50:50 2016 -0700 -- sdks/python/apache_beam/pipeline.py | 4 sdks/python/apache_beam/pipeline_test.py | 13 - 2 files changed, 16 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c34f332a/sdks/python/apache_beam/pipeline.py -- diff --git a/sdks/python/apache_beam/pipeline.py b/sdks/python/apache_beam/pipeline.py index a84cec3..012d4d9 100644 --- a/sdks/python/apache_beam/pipeline.py +++ b/sdks/python/apache_beam/pipeline.py @@ -47,6 +47,7 @@ import logging import os import shutil import tempfile +import types from apache_beam import pvalue from apache_beam import typehints @@ -196,6 +197,9 @@ class Pipeline(object): and needs to be cloned in order to apply again. """ if not isinstance(transform, ptransform.PTransform): + if isinstance(transform, (type, types.ClassType)): +raise TypeError("%s is not a PTransform instance, did you mean %s()?" +% (transform, transform.__name__)) transform = _CallableWrapperPTransform(transform) full_label = format_full_label(self._current_transform(), transform) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c34f332a/sdks/python/apache_beam/pipeline_test.py -- diff --git a/sdks/python/apache_beam/pipeline_test.py b/sdks/python/apache_beam/pipeline_test.py index 5e94087..8598737 100644 --- a/sdks/python/apache_beam/pipeline_test.py +++ b/sdks/python/apache_beam/pipeline_test.py @@ -32,6 +32,7 @@ from apache_beam.transforms import Create from apache_beam.transforms import FlatMap from apache_beam.transforms import Flatten from apache_beam.transforms import Map +from apache_beam.transforms import GroupByKey from apache_beam.transforms import PTransform from apache_beam.transforms import Read from apache_beam.transforms.util import assert_that, equal_to @@ -174,10 +175,20 @@ class PipelineTest(unittest.TestCase): def test_apply_custom_callable(self): pipeline = Pipeline(self.runner_name) pcoll = pipeline | Create('pcoll', [1, 2, 3]) -result = pipeline.apply(PipelineTest.custom_callable, pcoll) +result = pcoll | PipelineTest.custom_callable assert_that(result, equal_to([2, 3, 4])) pipeline.run() + def test_apply_custom_callable_error(self): +pipeline = Pipeline(self.runner_name) +pcoll = pipeline | Create('pcoll', [1, 2, 3]) +with self.assertRaises(TypeError) as cm: + pcoll | GroupByKey # Note the missing ()'s +self.assertEqual( +cm.exception.message, +" is not " +"a PTransform instance, did you mean GroupByKey()?") + def test_transform_no_super_init(self): class AddSuffix(PTransform):
[2/4] incubator-beam git commit: Cleanup dataflow_test.
Cleanup dataflow_test. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e0643774 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e0643774 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e0643774 Branch: refs/heads/python-sdk Commit: e06437746d0052f8e4af3edd5b19e4369038f826 Parents: 342d2d7 Author: Robert BradshawAuthored: Wed Jul 6 14:45:58 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 11:50:49 2016 -0700 -- sdks/python/apache_beam/dataflow_test.py | 13 ++--- 1 file changed, 2 insertions(+), 11 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e0643774/sdks/python/apache_beam/dataflow_test.py -- diff --git a/sdks/python/apache_beam/dataflow_test.py b/sdks/python/apache_beam/dataflow_test.py index d3721ee..c4933af 100644 --- a/sdks/python/apache_beam/dataflow_test.py +++ b/sdks/python/apache_beam/dataflow_test.py @@ -46,7 +46,7 @@ from apache_beam.transforms.window import WindowFn class DataflowTest(unittest.TestCase): """Dataflow integration tests.""" - SAMPLE_DATA = 'aa bb cc aa bb aa \n' * 10 + SAMPLE_DATA = ['aa bb cc aa bb aa \n'] * 10 SAMPLE_RESULT = [('cc', 10), ('bb', 20), ('aa', 30)] # TODO(silviuc): Figure out a nice way to specify labels for stages so that @@ -61,7 +61,7 @@ class DataflowTest(unittest.TestCase): def test_word_count(self): pipeline = Pipeline('DirectPipelineRunner') -lines = pipeline | Create('SomeWords', [DataflowTest.SAMPLE_DATA]) +lines = pipeline | Create('SomeWords', DataflowTest.SAMPLE_DATA) result = ( (lines | FlatMap('GetWords', lambda x: re.findall(r'\w+', x))) .apply('CountWords', DataflowTest.Count)) @@ -77,15 +77,6 @@ class DataflowTest(unittest.TestCase): assert_that(result, equal_to(['foo-A', 'foo-B', 'foo-C'])) pipeline.run() - def test_word_count_using_get(self): -pipeline = Pipeline('DirectPipelineRunner') -lines = pipeline | Create('SomeWords', [DataflowTest.SAMPLE_DATA]) -result = ( -(lines | FlatMap('GetWords', lambda x: re.findall(r'\w+', x))) -.apply('CountWords', DataflowTest.Count)) -assert_that(result, equal_to(DataflowTest.SAMPLE_RESULT)) -pipeline.run() - def test_par_do_with_side_input_as_arg(self): pipeline = Pipeline('DirectPipelineRunner') words_list = ['aa', 'bb', 'cc']
[4/4] incubator-beam git commit: Closes #595
Closes #595 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9bc04b75 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9bc04b75 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9bc04b75 Branch: refs/heads/python-sdk Commit: 9bc04b7507b1e01688698ecc508468932042f5b8 Parents: 342d2d7 c34f332 Author: Robert BradshawAuthored: Thu Jul 7 11:50:50 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 11:50:50 2016 -0700 -- sdks/python/apache_beam/dataflow_test.py| 13 ++-- .../examples/cookbook/custom_ptransform.py | 7 ++-- sdks/python/apache_beam/pipeline.py | 4 +++ sdks/python/apache_beam/pipeline_test.py| 13 +++- sdks/python/apache_beam/transforms/combiners.py | 34 +--- sdks/python/apache_beam/transforms/core.py | 4 +-- .../python/apache_beam/transforms/ptransform.py | 8 ++--- .../apache_beam/transforms/ptransform_test.py | 7 ++-- sdks/python/apache_beam/transforms/util.py | 8 ++--- 9 files changed, 48 insertions(+), 50 deletions(-) --
[1/4] incubator-beam git commit: Remove unneeded label argument in ptransform_fn
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 342d2d798 -> 9bc04b750 Remove unneeded label argument in ptransform_fn The label is already in the fully qualified name due to nesting. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/31b3f00c Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/31b3f00c Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/31b3f00c Branch: refs/heads/python-sdk Commit: 31b3f00cf709ee06fd6f3d38567404861c0ae244 Parents: e064377 Author: Robert BradshawAuthored: Wed Jul 6 15:15:51 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 11:50:49 2016 -0700 -- .../examples/cookbook/custom_ptransform.py | 7 ++-- sdks/python/apache_beam/transforms/combiners.py | 34 +--- sdks/python/apache_beam/transforms/core.py | 4 +-- .../python/apache_beam/transforms/ptransform.py | 8 ++--- .../apache_beam/transforms/ptransform_test.py | 7 ++-- sdks/python/apache_beam/transforms/util.py | 8 ++--- 6 files changed, 30 insertions(+), 38 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/31b3f00c/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py -- diff --git a/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py b/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py index f97545a..8da1f43 100644 --- a/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py +++ b/sdks/python/apache_beam/examples/cookbook/custom_ptransform.py @@ -57,7 +57,7 @@ def run_count2(known_args, options): """Runs the second example pipeline.""" @beam.ptransform_fn - def Count(label, pcoll): # pylint: disable=invalid-name,unused-argument + def Count(pcoll): # pylint: disable=invalid-name """Count as a decorated function.""" return ( pcoll @@ -76,12 +76,11 @@ def run_count3(known_args, options): """Runs the third example pipeline.""" @beam.ptransform_fn - # pylint: disable=invalid-name,unused-argument - def Count(label, pcoll, factor=1): + # pylint: disable=invalid-name + def Count(pcoll, factor=1): """Count as a decorated function with a side input. Args: - label: optional label for this transform pcoll: the PCollection passed in from the previous transform factor: the amount by which to count http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/31b3f00c/sdks/python/apache_beam/transforms/combiners.py -- diff --git a/sdks/python/apache_beam/transforms/combiners.py b/sdks/python/apache_beam/transforms/combiners.py index e9f11a0..8c56e5a 100644 --- a/sdks/python/apache_beam/transforms/combiners.py +++ b/sdks/python/apache_beam/transforms/combiners.py @@ -148,7 +148,7 @@ class Top(object): # pylint: disable=no-self-argument @ptransform.ptransform_fn - def Of(label, pcoll, n, compare, *args, **kwargs): + def Of(pcoll, n, compare, *args, **kwargs): """Obtain a list of the compare-most N elements in a PCollection. This transform will retrieve the n greatest elements in the PCollection @@ -160,7 +160,6 @@ class Top(object): become additional arguments to the comparator. Args: - label: display label for transform processes. pcoll: PCollection to process. n: number of elements to extract from pcoll. compare: as described above. @@ -168,10 +167,10 @@ class Top(object): **kwargs: as described above. """ return pcoll | core.CombineGlobally( -label, TopCombineFn(n, compare), *args, **kwargs) +TopCombineFn(n, compare), *args, **kwargs) @ptransform.ptransform_fn - def PerKey(label, pcoll, n, compare, *args, **kwargs): + def PerKey(pcoll, n, compare, *args, **kwargs): """Identifies the compare-most N elements associated with each key. This transform will produce a PCollection mapping unique keys in the input @@ -184,7 +183,6 @@ class Top(object): become additional arguments to the comparator. Args: - label: display label for transform processes. pcoll: PCollection to process. n: number of elements to extract from pcoll. compare: as described above. @@ -196,27 +194,27 @@ class Top(object): compatible with KV[A, B]. """ return pcoll | core.CombinePerKey( -label, TopCombineFn(n, compare), *args, **kwargs) +TopCombineFn(n, compare), *args, **kwargs) @ptransform.ptransform_fn - def Largest(label, pcoll, n): + def Largest(pcoll, n): """Obtain a list of the greatest N elements in a PCollection.""" -
[2/2] incubator-beam git commit: pipeline.options should never be None
pipeline.options should never be None Also fix a typehints error this exposed. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/87961e4b Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/87961e4b Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/87961e4b Branch: refs/heads/python-sdk Commit: 87961e4b5c489b1c973be3ce66cb66e1ab886228 Parents: 6b06e3e Author: Robert BradshawAuthored: Wed Jul 6 16:36:50 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 11:47:16 2016 -0700 -- sdks/python/apache_beam/pipeline.py | 23 sdks/python/apache_beam/pvalue_test.py | 3 ++- sdks/python/apache_beam/transforms/core.py | 1 - sdks/python/apache_beam/typehints/typehints.py | 5 - .../apache_beam/typehints/typehints_test.py | 3 +++ 5 files changed, 18 insertions(+), 17 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/87961e4b/sdks/python/apache_beam/pipeline.py -- diff --git a/sdks/python/apache_beam/pipeline.py b/sdks/python/apache_beam/pipeline.py index ee83614..a84cec3 100644 --- a/sdks/python/apache_beam/pipeline.py +++ b/sdks/python/apache_beam/pipeline.py @@ -106,9 +106,9 @@ class Pipeline(object): raise ValueError( 'Parameter argv, if specified, must be a list. Received : %r', argv) else: - self.options = None + self.options = PipelineOptions([]) -if runner is None and self.options is not None: +if runner is None: runner = self.options.view_as(StandardOptions).runner if runner is None: runner = StandardOptions.DEFAULT_RUNNER @@ -122,11 +122,10 @@ class Pipeline(object): 'name of a registered runner.') # Validate pipeline options -if self.options is not None: - errors = PipelineOptionsValidator(self.options, runner).validate() - if errors: -raise ValueError( -'Pipeline has validations errors: \n' + '\n'.join(errors)) +errors = PipelineOptionsValidator(self.options, runner).validate() +if errors: + raise ValueError( + 'Pipeline has validations errors: \n' + '\n'.join(errors)) # Default runner to be used. self.runner = runner @@ -151,7 +150,7 @@ class Pipeline(object): def run(self): """Runs the pipeline. Returns whatever our runner returns after running.""" -if not self.options or self.options.view_as(SetupOptions).save_main_session: +if self.options.view_as(SetupOptions).save_main_session: # If this option is chosen, verify we can pickle the main session early. tmpdir = tempfile.mkdtemp() try: @@ -226,12 +225,8 @@ class Pipeline(object): self._current_transform().add_part(current) self.transforms_stack.append(current) -if self.options is not None: - type_options = self.options.view_as(TypeOptions) -else: - type_options = None - -if type_options is not None and type_options.pipeline_type_check: +type_options = self.options.view_as(TypeOptions) +if type_options.pipeline_type_check: transform.type_check_inputs(pvalueish) pvalueish_result = self.runner.apply(transform, pvalueish) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/87961e4b/sdks/python/apache_beam/pvalue_test.py -- diff --git a/sdks/python/apache_beam/pvalue_test.py b/sdks/python/apache_beam/pvalue_test.py index ef7e5f5..bb742e0 100644 --- a/sdks/python/apache_beam/pvalue_test.py +++ b/sdks/python/apache_beam/pvalue_test.py @@ -47,6 +47,7 @@ class PValueTest(unittest.TestCase): pipeline = Pipeline('DirectPipelineRunner') value = pipeline | Create('create1', [1, 2, 3]) value2 = pipeline | Create('create2', [(1, 1), (2, 2), (3, 3)]) +value3 = pipeline | Create('create3', [(1, 1), (2, 2), (3, 3)]) self.assertEqual(AsSingleton(value), AsSingleton(value)) self.assertEqual(AsSingleton('new', value, default_value=1), AsSingleton('new', value, default_value=1)) @@ -59,7 +60,7 @@ class PValueTest(unittest.TestCase): self.assertNotEqual(AsSingleton(value), AsSingleton(value2)) self.assertNotEqual(AsIter(value), AsIter(value2)) self.assertNotEqual(AsList(value), AsList(value2)) -self.assertNotEqual(AsDict(value), AsDict(value2)) +self.assertNotEqual(AsDict(value2), AsDict(value3)) if __name__ == '__main__': http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/87961e4b/sdks/python/apache_beam/transforms/core.py -- diff
[GitHub] incubator-beam pull request #597: pipeline.options should never be None
Github user robertwb closed the pull request at: https://github.com/apache/incubator-beam/pull/597 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Closes #597
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 6b06e3e22 -> 342d2d798 Closes #597 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/342d2d79 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/342d2d79 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/342d2d79 Branch: refs/heads/python-sdk Commit: 342d2d79829ed9da9b8066a4ff0a4a02992188d6 Parents: 6b06e3e 87961e4 Author: Robert BradshawAuthored: Thu Jul 7 11:47:16 2016 -0700 Committer: Robert Bradshaw Committed: Thu Jul 7 11:47:16 2016 -0700 -- sdks/python/apache_beam/pipeline.py | 23 sdks/python/apache_beam/pvalue_test.py | 3 ++- sdks/python/apache_beam/transforms/core.py | 1 - sdks/python/apache_beam/typehints/typehints.py | 5 - .../apache_beam/typehints/typehints_test.py | 3 +++ 5 files changed, 18 insertions(+), 17 deletions(-) --
[GitHub] incubator-beam pull request #603: Modified addBulkOptions for simplicity
GitHub user ianzhou1 opened a pull request: https://github.com/apache/incubator-beam/pull/603 Modified addBulkOptions for simplicity You can merge this pull request into a Git repository by running: $ git pull https://github.com/ianzhou1/incubator-beam AddBulkOptions Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/603.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #603 commit e0d197638d24cadfa38afe39850ccae8861add67 Author: Ian ZhouDate: 2016-07-07T17:27:47Z Modified addBulkOptions for simplicity --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #593: Run lint on python/sdks folder to include ...
Github user aaltay closed the pull request at: https://github.com/apache/incubator-beam/pull/593 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Enable linter rules no-self-argument, reimported, ungrouped-imports
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 253497655 -> 6b06e3e22 Enable linter rules no-self-argument, reimported, ungrouped-imports And run lint on python/sdks folder to include setup.py. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2b14fe5d Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2b14fe5d Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2b14fe5d Branch: refs/heads/python-sdk Commit: 2b14fe5dd4d289599bc902a39392a7956a4d59c9 Parents: 2534976 Author: Ahmet AltayAuthored: Wed Jul 6 12:41:19 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 09:49:36 2016 -0700 -- sdks/python/.pylintrc | 3 --- sdks/python/apache_beam/examples/snippets/snippets.py | 1 + sdks/python/apache_beam/examples/snippets/snippets_test.py | 7 --- sdks/python/apache_beam/transforms/combiners_test.py | 9 - sdks/python/apache_beam/transforms/trigger.py | 3 ++- sdks/python/run_pylint.sh | 2 +- sdks/python/setup.py | 5 ++--- 7 files changed, 14 insertions(+), 16 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2b14fe5d/sdks/python/.pylintrc -- diff --git a/sdks/python/.pylintrc b/sdks/python/.pylintrc index c138f1a..c69fd2b 100644 --- a/sdks/python/.pylintrc +++ b/sdks/python/.pylintrc @@ -101,7 +101,6 @@ disable = multiple-statements, no-member, no-name-in-module, - no-self-argument, no-self-use, no-value-for-parameter, not-callable, @@ -112,14 +111,12 @@ disable = redefined-outer-name, redefined-variable-type, redundant-keyword-arg, - reimported, relative-import, similarities, simplifiable-if-statement, super-init-not-called, undefined-variable, unexpected-keyword-arg, - ungrouped-imports, unidiomatic-typecheck, unnecessary-lambda, unneeded-not, http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2b14fe5d/sdks/python/apache_beam/examples/snippets/snippets.py -- diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py b/sdks/python/apache_beam/examples/snippets/snippets.py index f5bbc66..d84deea 100644 --- a/sdks/python/apache_beam/examples/snippets/snippets.py +++ b/sdks/python/apache_beam/examples/snippets/snippets.py @@ -39,6 +39,7 @@ import apache_beam as beam # pylint:disable=invalid-name # pylint:disable=expression-not-assigned # pylint:disable=redefined-outer-name +# pylint:disable=reimported # pylint:disable=unused-variable # pylint:disable=wrong-import-order, wrong-import-position http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2b14fe5d/sdks/python/apache_beam/examples/snippets/snippets_test.py -- diff --git a/sdks/python/apache_beam/examples/snippets/snippets_test.py b/sdks/python/apache_beam/examples/snippets/snippets_test.py index 87ce266..6e1045f 100644 --- a/sdks/python/apache_beam/examples/snippets/snippets_test.py +++ b/sdks/python/apache_beam/examples/snippets/snippets_test.py @@ -26,10 +26,11 @@ import apache_beam as beam from apache_beam import io from apache_beam import pvalue from apache_beam import typehints -from apache_beam.examples.snippets import snippets from apache_beam.io import fileio from apache_beam.utils.options import TypeOptions +from apache_beam.examples.snippets import snippets +# pylint: disable=expression-not-assigned # Monky-patch to use native sink for file path re-writing. io.TextFileSink = fileio.NativeTextFileSink @@ -226,6 +227,7 @@ class TypeHintsTest(unittest.TestCase): def test_bad_types(self): p = beam.Pipeline('DirectPipelineRunner', argv=sys.argv) +evens = None # pylint: disable=unused-variable # [START type_hints_missing_define_numbers] numbers = p | beam.Create(['1', '2', '3']) @@ -236,7 +238,7 @@ class TypeHintsTest(unittest.TestCase): evens = numbers | beam.Filter(lambda x: x % 2 == 0) # [END type_hints_missing_apply] -# Now suppose numers was defined as [snippet above]. +# Now suppose numbers was defined as [snippet above]. # When running this pipeline, you'd get a runtime error, # possibly on a remote machine, possibly very late. @@ -298,7 +300,6 @@ class TypeHintsTest(unittest.TestCase): # [END type_hints_runtime_on] def test_deterministic_key(self): -p = beam.Pipeline('DirectPipelineRunner', argv=sys.argv) lines = ['banana,fruit,3', 'kiwi,fruit,2', 'kiwi,fruit,2',
[2/2] incubator-beam git commit: Closes #593
Closes #593 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6b06e3e2 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6b06e3e2 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6b06e3e2 Branch: refs/heads/python-sdk Commit: 6b06e3e220cb25b4aa4fe4ab3647e19d0c257f2a Parents: 2534976 2b14fe5 Author: Dan HalperinAuthored: Thu Jul 7 09:49:43 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 09:49:43 2016 -0700 -- sdks/python/.pylintrc | 3 --- sdks/python/apache_beam/examples/snippets/snippets.py | 1 + sdks/python/apache_beam/examples/snippets/snippets_test.py | 7 --- sdks/python/apache_beam/transforms/combiners_test.py | 9 - sdks/python/apache_beam/transforms/trigger.py | 3 ++- sdks/python/run_pylint.sh | 2 +- sdks/python/setup.py | 5 ++--- 7 files changed, 14 insertions(+), 16 deletions(-) --
[GitHub] incubator-beam pull request #602: Fix BigtableIO display data label
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/602 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Fix BigtableIO display data label
Fix BigtableIO display data label Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9c91403c Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9c91403c Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9c91403c Branch: refs/heads/master Commit: 9c91403c7b0b7e45473b30be8019a8e07c66fd1c Parents: 33b18b1 Author: Scott WegnerAuthored: Thu Jul 7 08:38:14 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 09:32:23 2016 -0700 -- .../transforms/display/DisplayDataMatchers.java | 22 .../sdk/transforms/display/DisplayDataTest.java | 13 ++-- .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 2 +- .../sdk/io/gcp/bigtable/BigtableIOTest.java | 10 - 4 files changed, 34 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9c91403c/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java index 23cffd4..025a1f7 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataMatchers.java @@ -360,4 +360,26 @@ public class DisplayDataMatchers { } }; } + + /** + * Creates a matcher that matches if the examined {@link DisplayData.Item} has the specified + * label. + */ + public static Matcher hasLabel(String label) { +return hasLabel(Matchers.is(label)); + } + + /** + * Creates a matcher that matches if the examined {@link DisplayData.Item} has a label matching + * the specified label matcher. + */ + public static Matcher hasLabel(Matcher labelMatcher) { +return new FeatureMatcher ( +labelMatcher, "display item with label", "label") { + @Override + protected String featureValueOf(DisplayData.Item actual) { +return actual.getLabel(); + } +}; + } } http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9c91403c/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java index a1189bb..cafe873 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java @@ -19,6 +19,7 @@ package org.apache.beam.sdk.transforms.display; import static org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasDisplayItem; import static org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasKey; +import static org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasLabel; import static org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasNamespace; import static org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasType; import static org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasValue; @@ -206,7 +207,7 @@ public class DisplayDataTest implements Serializable { hasType(DisplayData.Type.TIMESTAMP), hasValue(ISO_FORMATTER.print(value)), hasShortValue(nullValue(String.class)), -hasLabel(is("the current instant")), +hasLabel("the current instant"), hasUrl(is("http://time.gov;))); assertThat(item, matchesAllOf); @@ -1104,16 +1105,6 @@ public class DisplayDataTest implements Serializable { return hasItem(jsonNode); } - private static Matcher hasLabel(Matcher labelMatcher) { -return new FeatureMatcher ( -labelMatcher, "display item with label", "label") { - @Override - protected String featureValueOf(DisplayData.Item actual) { -return actual.getLabel(); - } -}; - } - private static Matcher hasUrl(Matcher urlMatcher) { return new FeatureMatcher ( urlMatcher, "display item with url", "URL") { http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9c91403c/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
[1/2] incubator-beam git commit: Closes #602
Repository: incubator-beam Updated Branches: refs/heads/master 33b18b173 -> 1963bde44 Closes #602 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1963bde4 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1963bde4 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1963bde4 Branch: refs/heads/master Commit: 1963bde446a798f0971393bb8f0076fc933367cf Parents: 33b18b1 9c91403 Author: Dan HalperinAuthored: Thu Jul 7 09:32:23 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 09:32:23 2016 -0700 -- .../transforms/display/DisplayDataMatchers.java | 22 .../sdk/transforms/display/DisplayDataTest.java | 13 ++-- .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 2 +- .../sdk/io/gcp/bigtable/BigtableIOTest.java | 10 - 4 files changed, 34 insertions(+), 13 deletions(-) --
[jira] [Commented] (BEAM-403) Support staging SDK packages from PyPI for remote execution
[ https://issues.apache.org/jira/browse/BEAM-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366324#comment-15366324 ] ASF GitHub Bot commented on BEAM-403: - Github user silviulica closed the pull request at: https://github.com/apache/incubator-beam/pull/569 > Support staging SDK packages from PyPI for remote execution > --- > > Key: BEAM-403 > URL: https://issues.apache.org/jira/browse/BEAM-403 > Project: Beam > Issue Type: Bug > Components: runner-dataflow, sdk-py >Reporter: Silviu Calinoiu >Assignee: Silviu Calinoiu >Priority: Minor > Fix For: 0.2.0-incubating > > > Currently the dataflow runner will pickup the SDK tarball from the old github > repo and stage it. We need to pick it up from PyPI (where packages will be > released) and stage it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #569: [BEAM-403] Get current SDK package from Py...
Github user silviulica closed the pull request at: https://github.com/apache/incubator-beam/pull/569 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #602: Fix BigtableIO display data label
GitHub user swegner opened a pull request: https://github.com/apache/incubator-beam/pull/602 Fix BigtableIO display data label Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/swegner/incubator-beam bigtable-displaydata Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/602.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #602 commit 064e4c00ad410b041098a918ceb3983d668f1c7a Author: Scott WegnerDate: 2016-07-07T15:38:14Z Fix BigtableIO display data label --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #601: Documentation URL provided previously thro...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/601 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #601
Closes #601 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/33b18b17 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/33b18b17 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/33b18b17 Branch: refs/heads/master Commit: 33b18b173c8132435f6051827cdc1355693205ff Parents: 66d726a b90f71a Author: Dan HalperinAuthored: Thu Jul 7 08:35:17 2016 -0700 Committer: Dan Halperin Committed: Thu Jul 7 08:35:17 2016 -0700 -- README.md | 8 1 file changed, 4 insertions(+), 4 deletions(-) --
[jira] [Resolved] (BEAM-408) ProxyInvocationHandler uses inefficient Math.random() for random int
[ https://issues.apache.org/jira/browse/BEAM-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-408. Resolution: Fixed Fix Version/s: 0.2.0-incubating > ProxyInvocationHandler uses inefficient Math.random() for random int > > > Key: BEAM-408 > URL: https://issues.apache.org/jira/browse/BEAM-408 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Lucas Amorim >Priority: Minor > Labels: findbugs, newbie, starter > Fix For: 0.2.0-incubating > > > [FindBugs > DM_NEXTINT_VIA_NEXTDOUBLE|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L209]: > Use the nextInt method of Random rather than nextDouble to generate a random > integer > Applies to: > [ProxyInvocationHandler.hashCode|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java#L96]. > This random value is used as a pseudo-unique value for #hashCode(), although > the default Object.hasCode() implementation does the same thing, so this > should be unnecessary. > This is a good starter bug. When fixing, please remove the corresponding > entries from > [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml] > and verify the build passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-408) ProxyInvocationHandler uses inefficient Math.random() for random int
[ https://issues.apache.org/jira/browse/BEAM-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366127#comment-15366127 ] ASF GitHub Bot commented on BEAM-408: - Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/588 > ProxyInvocationHandler uses inefficient Math.random() for random int > > > Key: BEAM-408 > URL: https://issues.apache.org/jira/browse/BEAM-408 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Lucas Amorim >Priority: Minor > Labels: findbugs, newbie, starter > > [FindBugs > DM_NEXTINT_VIA_NEXTDOUBLE|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L209]: > Use the nextInt method of Random rather than nextDouble to generate a random > integer > Applies to: > [ProxyInvocationHandler.hashCode|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java#L96]. > This random value is used as a pseudo-unique value for #hashCode(), although > the default Object.hasCode() implementation does the same thing, so this > should be unnecessary. > This is a good starter bug. When fixing, please remove the corresponding > entries from > [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml] > and verify the build passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/4] incubator-beam git commit: Removes extra blank space
Repository: incubator-beam Updated Branches: refs/heads/master 1cb898f58 -> 66d726aa2 Removes extra blank space Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7e97cbcc Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7e97cbcc Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7e97cbcc Branch: refs/heads/master Commit: 7e97cbcc8a6ec47bcfad17054e581a45dc248ac2 Parents: 9df033c Author: Lucas AmorimAuthored: Wed Jul 6 23:11:55 2016 -0700 Committer: Luke Cwik Committed: Thu Jul 7 09:07:15 2016 -0400 -- .../org/apache/beam/sdk/options/ProxyInvocationHandler.java| 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/7e97cbcc/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java index cb69979..fe67f16 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java @@ -91,11 +91,9 @@ import javax.annotation.concurrent.ThreadSafe; class ProxyInvocationHandler implements InvocationHandler, HasDisplayData { private static final ObjectMapper MAPPER = new ObjectMapper(); /** - * No two instances of this class are considered equivalent hence we generate a random hash code - * between 0 and {@link Integer#MAX_VALUE}. + * No two instances of this class are considered equivalent hence we generate a random hash code. */ - private final int hashCode = ThreadLocalRandom.current().nextInt( - Integer.MIN_VALUE, Integer.MAX_VALUE); + private final int hashCode = ThreadLocalRandom.current().nextInt(); private final Set knownInterfaces; private final ClassToInstanceMap interfaceToProxyCache; private final Map options;
[GitHub] incubator-beam pull request #588: [BEAM-408] - Fixes ProxyInvocationHandler ...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/588 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[4/4] incubator-beam git commit: [BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for random int
[BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for random int This closes #588 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/66d726aa Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/66d726aa Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/66d726aa Branch: refs/heads/master Commit: 66d726aa2aa56f11f2ac91771941d84e64fb4e4c Parents: 1cb898f 7e97cbc Author: Luke CwikAuthored: Thu Jul 7 09:07:30 2016 -0400 Committer: Luke Cwik Committed: Thu Jul 7 09:07:30 2016 -0400 -- .../build-tools/src/main/resources/beam/findbugs-filter.xml| 6 -- .../org/apache/beam/sdk/options/ProxyInvocationHandler.java| 6 +++--- 2 files changed, 3 insertions(+), 9 deletions(-) --
[3/4] incubator-beam git commit: [BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for random int
[BEAM-408] - Fixes ProxyInvocationHandler uses inefficient Math.random() for random int For more information: https://issues.apache.org/jira/browse/BEAM-408 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/708244bf Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/708244bf Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/708244bf Branch: refs/heads/master Commit: 708244bf91b4079690bbaed0c063b70e76a95794 Parents: 1cb898f Author: Lucas AmorimAuthored: Mon Jul 4 15:47:39 2016 -0700 Committer: Luke Cwik Committed: Thu Jul 7 09:07:15 2016 -0400 -- .../build-tools/src/main/resources/beam/findbugs-filter.xml| 6 -- .../org/apache/beam/sdk/options/ProxyInvocationHandler.java| 4 +++- 2 files changed, 3 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/708244bf/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml -- diff --git a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml index a1f0e8a..d151315 100644 --- a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml +++ b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml @@ -194,12 +194,6 @@ - - - - - - http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/708244bf/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java index e3d763b..0c10c2f 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java @@ -66,6 +66,7 @@ import java.util.HashSet; import java.util.Iterator; import java.util.List; import java.util.Map; +import java.util.Random; import java.util.Set; import java.util.SortedMap; import java.util.TreeMap; @@ -93,7 +94,8 @@ class ProxyInvocationHandler implements InvocationHandler, HasDisplayData { * No two instances of this class are considered equivalent hence we generate a random hash code * between 0 and {@link Integer#MAX_VALUE}. */ - private final int hashCode = (int) (Math.random() * Integer.MAX_VALUE); + private final int hashCode = (RANDOM.nextInt() * Integer.MAX_VALUE); + private static final Random RANDOM = new Random(); private final Set knownInterfaces; private final ClassToInstanceMap interfaceToProxyCache; private final Map options;
[GitHub] incubator-beam pull request #585: Merge master into runners-spark2 branch to...
Github user amitsela closed the pull request at: https://github.com/apache/incubator-beam/pull/585 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-381) OffsetBasedReader should construct sources before updating the range tracker
[ https://issues.apache.org/jira/browse/BEAM-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365666#comment-15365666 ] ASF GitHub Bot commented on BEAM-381: - GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/600 [BEAM-381] BoundedReader: update the range last of all Reorders the code in some splitAtFraction calls so that the rangeTracker update is the last thing (besides assignment) in the function. This avoids a potential issue if creating the primary or residual sources happens to throw an exception. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam offsetbasedreader Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/600.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #600 commit 489432a3ce31aa3c769834920841c7a1df298720 Author: Dan HalperinDate: 2016-07-07T06:05:56Z [BEAM-381] BoundedReader: update the range last of all Reorders the code in some splitAtFraction calls so that the rangeTracker update is the last thing (besides assignment) in the function. This avoids a potential issue if creating the primary or residual sources happens to throw an exception. > OffsetBasedReader should construct sources before updating the range tracker > > > Key: BEAM-381 > URL: https://issues.apache.org/jira/browse/BEAM-381 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 0.1.0-incubating, 0.2.0-incubating >Reporter: Daniel Halperin >Assignee: Daniel Halperin > > OffsetBasedReader has the following code: > {code} > if (!rangeTracker.trySplitAtPosition(splitOffset)) { > return null; > } > long start = source.getStartOffset(); > long end = source.getEndOffset(); > OffsetBasedSource primary = source.createSourceForSubrange(start, > splitOffset); > OffsetBasedSource residual = > source.createSourceForSubrange(splitOffset, end); > this.source = primary; > return residual; > {code} > The first line is the line that updates the range of this source. However, > subsequent lines might throw (specifically, in > source.createSourceForSubrange). We should construct the sources first, and > then catch exceptions and return null if they fail. This way, the > splitAtFraction call will not throw (so work is not wasted) and the range > tracker will not be updated if either the primary or (more likely) the > residual could not be created. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #600: [BEAM-381] BoundedReader: update the range...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/600 [BEAM-381] BoundedReader: update the range last of all Reorders the code in some splitAtFraction calls so that the rangeTracker update is the last thing (besides assignment) in the function. This avoids a potential issue if creating the primary or residual sources happens to throw an exception. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam offsetbasedreader Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/600.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #600 commit 489432a3ce31aa3c769834920841c7a1df298720 Author: Dan HalperinDate: 2016-07-07T06:05:56Z [BEAM-381] BoundedReader: update the range last of all Reorders the code in some splitAtFraction calls so that the rangeTracker update is the last thing (besides assignment) in the function. This avoids a potential issue if creating the primary or residual sources happens to throw an exception. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #601: Documentation URL provided previously thro...
GitHub user SrinikhilReddy opened a pull request: https://github.com/apache/incubator-beam/pull/601 Documentation URL provided previously throws Error You can merge this pull request into a Git repository by running: $ git pull https://github.com/SrinikhilReddy/incubator-beam master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/601.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #601 commit 773f0fd1696627a0991682042a765813a9000af9 Author: srinikhilDate: 2016-07-07T06:06:54Z Change In Documentation URL --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---