Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Gearpump #165

2017-03-29 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1829) MQTT message compression not working on Raspberry Pi

2017-03-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948398#comment-15948398
 ] 

Jean-Baptiste Onofré commented on BEAM-1829:


By the way, {{MqttIO}} doesn't explicitly define a {{snappy}} dependency, as 
it doesn't use it directly: {{snappy}} comes in as a transitive dependency of 
{{beam-sdks-java-core}}. So we would have to update the version in the SDK and 
be careful about side effects.

> MQTT message compression not working on Raspberry Pi
> 
>
> Key: BEAM-1829
> URL: https://issues.apache.org/jira/browse/BEAM-1829
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: 0.6.0
>Reporter: Vassil Kolarov
>Assignee: Jean-Baptiste Onofré
>  Labels: MQTT, Snappy
>
> Most probably due to this bug: 
> https://github.com/xerial/snappy-java/issues/147, the following exception is 
> raised when running on Raspberry Pi:
> Exception in thread "main" java.lang.UnsatisfiedLinkError: 
> /root/~/tmp/snappy-1.1.2-3c6134d1-26c5-4fb0-b6c9-669d4848d15b-libsnappyjava.so:
>  /root/~/tmp/snappy-1.1.2-3c6134d1-26c5-4fb0-b6c9-669d4848d15b-libsn
> appyjava.so: cannot open shared object file: No such file or directory
> at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1929)
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1814)
> at java.lang.Runtime.load0(Runtime.java:809)
> at java.lang.System.load(System.java:1083)
> at 
> org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:174)
> at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:152)
> at org.xerial.snappy.Snappy.<clinit>(Snappy.java:46)
> at 
> org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:97)
> at 
> org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:89)
> at 
> org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
> at 
> org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:48)
> at 
> org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:83)
> at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:141)
> at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:136)
> at org.apache.beam.sdk.io.Read.from(Read.java:56)
> at org.apache.beam.sdk.io.mqtt.MqttIO$Read.expand(MqttIO.java:274)
> at org.apache.beam.sdk.io.mqtt.MqttIO$Read.expand(MqttIO.java:221)
> at 
> org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
> at 
> org.apache.beam.runners.direct.DirectRunner.apply(DirectRunner.java:296)
> at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:388)
> at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:302)
> at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
> at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:152)
> at org.blah.beam.MqttPipeline.main(MqttPipeline.java:37)
> Increasing the snappy version to 1.1.4 will probably fix the issue.
> Best regards,
> Vassil
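The UnsatisfiedLinkError above boils down to the OS loader being unable to open
the extracted native library. The same class of failure can be reproduced
generically (an illustrative Python/ctypes sketch; the report itself concerns
Java and snappy-java, and the path below is made up):

```python
import ctypes

def try_load(path):
    """Attempt to load a shared object; return None on success,
    or the loader's message, e.g. 'cannot open shared object file'."""
    try:
        ctypes.CDLL(path)
        return None
    except OSError as e:
        return str(e)

# A nonexistent path fails the same way the extracted
# libsnappyjava.so does on an unsupported platform.
err = try_load("/no/such/dir/libsnappyjava.so")
assert err is not None
```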



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1829) MQTT message compression not working on Raspberry Pi

2017-03-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948391#comment-15948391
 ] 

Jean-Baptiste Onofré commented on BEAM-1829:


Happy to review!



[jira] [Assigned] (BEAM-1829) MQTT message compression not working on Raspberry Pi

2017-03-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré reassigned BEAM-1829:
--

Assignee: Jean-Baptiste Onofré  (was: Davor Bonaci)



[jira] [Commented] (BEAM-972) Add basic level of unit testing to gearpump runner

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948386#comment-15948386
 ] 

ASF GitHub Bot commented on BEAM-972:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2302


> Add basic level of unit testing to gearpump runner
> --
>
> Key: BEAM-972
> URL: https://issues.apache.org/jira/browse/BEAM-972
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-gearpump, testing
>Reporter: Manu Zhang
>Assignee: Huafeng Wang
>






[GitHub] beam pull request #2302: [BEAM-972] Add unit tests to Gearpump runner

2017-03-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2302


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #2302: Add unit tests to Gearpump runner

2017-03-29 Thread kenn
This closes #2302: Add unit tests to Gearpump runner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f4f23330
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f4f23330
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f4f23330

Branch: refs/heads/gearpump-runner
Commit: f4f233304836871884378c2a897ebcab07f08b2d
Parents: 555842a eb0d333
Author: Kenneth Knowles 
Authored: Wed Mar 29 21:03:08 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Mar 29 21:03:08 2017 -0700

--
 examples/java/pom.xml   | 12 +++
 pom.xml |  6 ++
 runners/gearpump/README.md  | 41 -
 runners/gearpump/pom.xml|  2 -
 .../gearpump/GearpumpRunnerRegistrar.java   |  4 +-
 .../translators/WindowAssignTranslator.java |  2 +-
 .../gearpump/translators/io/ValuesSource.java   |  2 -
 .../gearpump/GearpumpRunnerRegistrarTest.java   | 55 
 .../runners/gearpump/PipelineOptionsTest.java   | 73 
 .../translators/io/GearpumpSourceTest.java  | 90 
 .../gearpump/translators/io/ValueSoureTest.java | 82 ++
 11 files changed, 362 insertions(+), 7 deletions(-)
--




[1/2] beam git commit: [BEAM-972] Add unit tests to Gearpump runner

2017-03-29 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/gearpump-runner 555842a1a -> f4f233304


[BEAM-972] Add unit tests to Gearpump runner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/eb0d333d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/eb0d333d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/eb0d333d

Branch: refs/heads/gearpump-runner
Commit: eb0d333df23624f54aae2abb8d7c7873f8ed2a7a
Parents: 555842a
Author: huafengw 
Authored: Tue Mar 21 19:45:10 2017 +0800
Committer: huafengw 
Committed: Thu Mar 23 19:52:11 2017 +0800

--
 examples/java/pom.xml   | 12 +++
 pom.xml |  6 ++
 runners/gearpump/README.md  | 41 -
 runners/gearpump/pom.xml|  2 -
 .../gearpump/GearpumpRunnerRegistrar.java   |  4 +-
 .../translators/WindowAssignTranslator.java |  2 +-
 .../gearpump/translators/io/ValuesSource.java   |  2 -
 .../gearpump/GearpumpRunnerRegistrarTest.java   | 55 
 .../runners/gearpump/PipelineOptionsTest.java   | 73 
 .../translators/io/GearpumpSourceTest.java  | 90 
 .../gearpump/translators/io/ValueSoureTest.java | 82 ++
 11 files changed, 362 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/eb0d333d/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index ed4a1d4..0a6d8fe 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -87,6 +87,18 @@
   
 
 
+
+
+  gearpump-runner
+  
+
+  org.apache.beam
+  beam-runners-gearpump
+  runtime
+
+  
+
+
 
 
   flink-runner

http://git-wip-us.apache.org/repos/asf/beam/blob/eb0d333d/pom.xml
--
diff --git a/pom.xml b/pom.xml
index c3b8476..2cdb09d 100644
--- a/pom.xml
+++ b/pom.xml
@@ -475,6 +475,12 @@
 
   
 org.apache.beam
+beam-runners-gearpump
+${project.version}
+  
+
+  
+org.apache.beam
 beam-examples-java
 ${project.version}
   

http://git-wip-us.apache.org/repos/asf/beam/blob/eb0d333d/runners/gearpump/README.md
--
diff --git a/runners/gearpump/README.md b/runners/gearpump/README.md
index ad043fa..e8ce794 100644
--- a/runners/gearpump/README.md
+++ b/runners/gearpump/README.md
@@ -19,4 +19,43 @@
 
 ## Gearpump Beam Runner
 
-The Gearpump Beam runner allows users to execute pipelines written using the 
Apache Beam programming API with Apache Gearpump (incubating) as an execution 
engine. 
\ No newline at end of file
+The Gearpump Beam runner allows users to execute pipelines written using the 
Apache Beam programming API with Apache Gearpump (incubating) as an execution 
engine.
+
+##Getting Started
+
+The following shows how to run the WordCount example that is provided with the 
source code on Beam.
+
+###Installing Beam
+
+To get the latest version of Beam with Gearpump-Runner, first clone the Beam 
repository:
+
+```
+git clone https://github.com/apache/beam
+git checkout gearpump-runner
+```
+
+Then switch to the newly created directory and run Maven to build the Apache 
Beam:
+
+```
+cd beam
+mvn clean install -DskipTests
+```
+
+Now Apache Beam and the Gearpump Runner are installed in your local Maven 
repository.
+
+###Running Wordcount Example
+
+Download something to count:
+
+```
+curl http://www.gutenberg.org/cache/epub/1128/pg1128.txt > /tmp/kinglear.txt
+```
+
+Run the pipeline, using the Gearpump runner:
+
+```
+cd examples/java
+mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount 
-Dexec.args="--inputFile=/tmp/kinglear.txt --output=/tmp/wordcounts.txt 
--runner=TestGearpumpRunner" -Pgearpump-runner
+```
+
+Once completed, check the output file /tmp/wordcounts.txt-0-of-1

http://git-wip-us.apache.org/repos/asf/beam/blob/eb0d333d/runners/gearpump/pom.xml
--
diff --git a/runners/gearpump/pom.xml b/runners/gearpump/pom.xml
index 9a6a432..a691801 100644
--- a/runners/gearpump/pom.xml
+++ b/runners/gearpump/pom.xml
@@ -99,13 +99,11 @@
   org.apache.gearpump
   gearpump-streaming_2.11
   ${gearpump.version}
-  provided
 
 
   org.apache.gearpump
   gearpump-core_2.11
   ${gearpump.version}
-  provided
 
 
   com.typesafe


[jira] [Commented] (BEAM-1835) NPE in DirectRunner PubsubReader.ackBatch

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948290#comment-15948290
 ] 

ASF GitHub Bot commented on BEAM-1835:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2368

[BEAM-1835] Make Pubsub ackBatch safer

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-XXX] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `XXX` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
Checkpoints can be finalized at any point, including while the reader is
being closed or after it has been. When ackBatch is called and the
reader is closed, it should return. When close is called while a
different thread is using the client, it should wait for that thread to
complete before closing the reader.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam pubsub_npe

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2368.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2368


commit f00c850599ec864e5e1f2faea0d02b47d89a2aba
Author: Thomas Groh 
Date:   2017-03-30T02:26:06Z

Make Pubsub ackBatch safer

Checkpoints can be finalized at any point, including while the reader is
being closed or after it has been. When ackBatch is called and the
reader is closed, it should return. When close is called while a
different thread is using the client, it should wait for that thread to
complete before closing the reader.




> NPE in DirectRunner PubsubReader.ackBatch
> -
>
> Key: BEAM-1835
> URL: https://issues.apache.org/jira/browse/BEAM-1835
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct, sdk-java-core
>Reporter: Rafal Wojdyla
>Assignee: Rafal Wojdyla
>
> DirectRunner streaming mode throws null pointer exception:
> {noformat}
> org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:639)
>   at 
> org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:312)
>   at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.finishRead(UnboundedReadEvaluatorFactory.java:223)
>   at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:144)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This does not always happen, but with a large enough number of events it is 
> fairly reproducible. The problem seems to be the concurrent reuse of a 
> reader among multiple threads, and a race condition when one of the threads 
> "decides" to close the reader, based on:
> {code}
> private static final double DEFAULT_READER_REUSE_CHANCE = 0.95;
> {code}
> the close nulls out the pubsub client:
> {code}
> @Override
> public void close() throws IOException {
>   if (pubsubClient != null) {
> pubsubClient.close();
> pubsubClient = null;
>   }
> }
> {code}
> which, if still in use by another thread, results in the NPE above.
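The failure mode described above can be sketched outside Beam with a toy client
holder (illustrative Python; the class names are made up, not Beam's): once one
thread nulls out the shared client, any other caller still using it fails — the
Python analog of the NPE being an AttributeError.

```python
class FakeClient:
    """Stand-in for the pubsub client."""
    def ack(self):
        return "acked"

class ReaderSketch:
    """Toy reader sharing one client; mirrors the buggy close()."""
    def __init__(self):
        self.client = FakeClient()

    def ack_batch(self):
        # Uses self.client without checking whether a concurrent
        # close() already nulled it out.
        return self.client.ack()

    def close(self):
        if self.client is not None:
            self.client = None  # other threads now observe None

reader = ReaderSketch()
assert reader.ack_batch() == "acked"
reader.close()
try:
    reader.ack_batch()   # the analog of the NPE in the report
    raised = False
except AttributeError:
    raised = True
assert raised
```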





[GitHub] beam pull request #2368: [BEAM-1835] Make Pubsub ackBatch safer

2017-03-29 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2368

[BEAM-1835] Make Pubsub ackBatch safer



[jira] [Commented] (BEAM-1835) NPE in DirectRunner PubsubReader.ackBatch

2017-03-29 Thread Thomas Groh (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948285#comment-15948285
 ] 

Thomas Groh commented on BEAM-1835:
---

This is actually a bug in PubsubUnboundedSource or PubsubCheckpoint. The 
{{finalizeCheckpoint}} documentation states that
"This finalize method may be called from any thread, concurrently with calls to 
{{UnboundedReader}} it was created from."
and
"It is not safe to assume the {{UnboundedReader}} from which this checkpoint 
was created still exists at the time this method is called."

The Runner is permitted to close the reader and finalize all outstanding 
checkpoints in whatever order, potentially interleaving the two operations, so 
the checkpoint mark must not assume that the client is still available, or that 
the client, if it is available and open when the call to finalizeCheckpoint 
begins, is still available and open at any future point.
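The contract described above can be sketched with a lock-guarded client
(illustrative Python, not Beam's actual Java implementation; the names are made
up): the ack becomes a no-op once the reader is closed, and close() waits for
in-flight callers by taking the same lock.

```python
import threading

class FakeClient:
    def ack(self):
        return True
    def close(self):
        pass

class GuardedReader:
    """Sketch of a checkpoint-style ack that must not assume
    the reader's client is still open."""

    def __init__(self):
        self._lock = threading.Lock()
        self._client = FakeClient()

    def ack_batch(self):
        with self._lock:
            if self._client is None:
                return False  # reader already closed: no-op instead of NPE
            return self._client.ack()

    def close(self):
        # Acquiring the lock waits for any in-flight ack_batch to finish.
        with self._lock:
            if self._client is not None:
                self._client.close()
                self._client = None

reader = GuardedReader()
assert reader.ack_batch() is True
reader.close()
assert reader.ack_batch() is False   # safe after close
reader.close()                       # close is idempotent
```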



[1/2] beam git commit: Ensure that assert_that takes a PCollection as its first argument.

2017-03-29 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 4d633bc5a -> 70aa0a4a4


Ensure that assert_that takes a PCollection as its first argument.

This can avoid (silent) errors such as

with test_pipeline.TestPipeline() as p:
  bad_value = "Create0" >> beam.Create([1])  # Note missing "p | "
  beam_util.assert_that(bad_value, beam_util.equal_to([0]))
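The guard this commit adds can be shown standalone (a stand-in PCollection
class, not the real apache_beam.pvalue one): type-checking the first argument
makes a never-applied transform fail loudly instead of silently passing.

```python
class PCollection:
    """Stand-in for apache_beam.pvalue.PCollection."""

def assert_that(actual, matcher):
    # Fail fast if the caller passed something that is not a PCollection,
    # e.g. a transform that was never applied with `p | ...`.
    if not isinstance(actual, PCollection):
        raise AssertionError(
            "assert_that expected a PCollection, got %s"
            % type(actual).__name__)
    matcher(actual)

pc = PCollection()
assert_that(pc, lambda _: None)              # fine

try:
    assert_that("Create0", lambda _: None)   # the bad-value shape above
    raised = False
except AssertionError:
    raised = True
assert raised
```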


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4a963feb
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4a963feb
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4a963feb

Branch: refs/heads/master
Commit: 4a963feb8d12d81aedffc2933738f98e266ba916
Parents: 4d633bc
Author: Robert Bradshaw 
Authored: Thu Mar 9 16:18:46 2017 -0800
Committer: Ahmet Altay 
Committed: Wed Mar 29 17:53:24 2017 -0700

--
 sdks/python/apache_beam/transforms/util.py | 2 ++
 1 file changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/4a963feb/sdks/python/apache_beam/transforms/util.py
--
diff --git a/sdks/python/apache_beam/transforms/util.py 
b/sdks/python/apache_beam/transforms/util.py
index ac7eb3c..a6ecf0a 100644
--- a/sdks/python/apache_beam/transforms/util.py
+++ b/sdks/python/apache_beam/transforms/util.py
@@ -20,6 +20,7 @@
 
 from __future__ import absolute_import
 
+from apache_beam import pvalue
 from apache_beam.transforms import window
 from apache_beam.transforms.core import CombinePerKey
 from apache_beam.transforms.core import Create
@@ -219,6 +220,7 @@ def assert_that(actual, matcher, label='assert_that'):
   Returns:
 Ignored.
   """
+  assert isinstance(actual, pvalue.PCollection)
 
   class AssertThat(PTransform):
 



[GitHub] beam pull request #2214: Ensure that assert_that takes a PCollection as its ...

2017-03-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2214




[jira] [Commented] (BEAM-1823) TimedOutException in postcommit

2017-03-29 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948173#comment-15948173
 ] 

Mark Liu commented on BEAM-1823:


We set the ValidatesRunner test timeout here: 
https://github.com/apache/beam/blob/master/sdks/python/run_postcommit.sh#L81

Currently it's 600s, so I think this is a special case where the job exceeds the 
time limit but succeeds in the end. From my observation, a ValidatesRunner test 
generally takes about 4~5 minutes to finish, but we could increase the timeout 
given that the backend occasionally slows down.

Logging is controlled by nose. Further investigation is needed to see whether 
the full console log can be printed in this case. Also, having the job_id 
printed in ValidatesRunner tests (BEAM-1728) would definitely help with 
debugging.
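As a rough sketch of what such a harness-level timeout does (illustrative only;
the real limit is configured in run_postcommit.sh and enforced by nose):

```python
import threading
import time

def run_with_timeout(fn, timeout_s):
    """Run fn on a worker thread; raise if it outlives timeout_s seconds."""
    done = threading.Event()

    def worker():
        fn()
        done.set()

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout_s)
    if not done.is_set():
        raise TimeoutError("test still running after %ss" % timeout_s)

run_with_timeout(lambda: None, 1.0)   # fast test: passes

try:
    run_with_timeout(lambda: time.sleep(1.0), 0.05)
    timed_out = False
except TimeoutError:
    timed_out = True
assert timed_out
```

Note the sketch abandons (rather than kills) the worker thread on timeout,
which is roughly what turning a slow job into a TimedOutException looks like
from the harness side.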

> TimedOutException in postcommit
> ---
>
> Key: BEAM-1823
> URL: https://issues.apache.org/jira/browse/BEAM-1823
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>
> Mark, do you know what this error means? Where is the timeout configured?
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1657/console
> I _think_ this is one of the underlying Dataflow executions and it completed 
> (although much slower than usual):
> https://pantheon.corp.google.com/dataflow/job/2017-03-28_14_25_21-13472017589125356257?project=apache-beam-testing=433637338589
> It makes sense to time out the test, but I want to know how it is configured. 
> Also, is it possible to print out the logs for failed or timed-out tests so 
> that we can clearly associate tests with job executions?





[GitHub] beam pull request #2366: Reopen BigQuery utils after #2271

2017-03-29 Thread ravwojdyla
GitHub user ravwojdyla opened a pull request:

https://github.com/apache/beam/pull/2366

Reopen BigQuery utils after #2271



commit 0291729827c3fc88b913f161374473ede61cb6a0
Author: Rafal Wojdyla 
Date:   2017-03-30T00:14:12Z

Reopen BigQuery utils after #2271






[jira] [Created] (BEAM-1836) Reopen BigQuery utils after #2271

2017-03-29 Thread Rafal Wojdyla (JIRA)
Rafal Wojdyla created BEAM-1836:
---

 Summary: Reopen BigQuery utils after #2271
 Key: BEAM-1836
 URL: https://issues.apache.org/jira/browse/BEAM-1836
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Rafal Wojdyla
Assignee: Rafal Wojdyla


https://github.com/apache/beam/pull/2271 splits BigQueryIO. It should not change 
any functionality, but it closes off some of the useful BigQuery utils, like:
 * {{toTableSpec}}
 * {{parseTableSpec}}

We use those in scio and would prefer not to reimplement them.
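For context, these helpers convert between a table reference and the
`[project:]dataset.table` spec string. A rough Python sketch of their behavior
(illustrative only; the real helpers are Java code in Beam's BigQuery IO):

```python
import re

# [project:]dataset.table, with no ':' or '.' inside the components.
_TABLE_SPEC = re.compile(
    r"^(?:(?P<project>[^:.]+):)?(?P<dataset>[^:.]+)\.(?P<table>[^:.]+)$")

def parse_table_spec(spec):
    """Split a table spec string into its components."""
    m = _TABLE_SPEC.match(spec)
    if m is None:
        raise ValueError("not a table spec: %r" % spec)
    return m.groupdict()

def to_table_spec(project, dataset, table):
    """Render components back into a spec string."""
    prefix = project + ":" if project else ""
    return "%s%s.%s" % (prefix, dataset, table)

ref = parse_table_spec("apache-beam-testing:mydataset.mytable")
assert ref == {"project": "apache-beam-testing",
               "dataset": "mydataset", "table": "mytable"}
assert to_table_spec(ref["project"], ref["dataset"], ref["table"]) \
    == "apache-beam-testing:mydataset.mytable"
```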





Jenkins build is back to normal : beam_PostCommit_Python_Verify #1668

2017-03-29 Thread Apache Jenkins Server
See 




[jira] [Closed] (BEAM-1091) Python E2E WordCount test in Precommit

2017-03-29 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-1091.
--

> Python E2E WordCount test in Precommit
> --
>
> Key: BEAM-1091
> URL: https://issues.apache.org/jira/browse/BEAM-1091
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py, testing
>Reporter: Mark Liu
>Assignee: Mark Liu
> Fix For: Not applicable
>
>
> We want to include some e2e tests in precommit in order to catch bugs earlier, 
> instead of breaking postcommit so often.
> As in postcommit, we want the same wordcount test in precommit, 
> executed by DirectPipelineRunner and DataflowRunner. 





[jira] [Resolved] (BEAM-1687) Reduce Total Time of Running Python ValidatesRunner Tests in Postcommit

2017-03-29 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-1687.

   Resolution: Done
Fix Version/s: Not applicable

> Reduce Total Time of Running Python ValidatesRunner Tests in Postcommit
> ---
>
> Key: BEAM-1687
> URL: https://issues.apache.org/jira/browse/BEAM-1687
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py, testing
>Reporter: Mark Liu
>Assignee: Mark Liu
> Fix For: Not applicable
>
>
> It takes 1h+ to run the 14 ValidatesRunner tests in the Python postcommit; 
> they can be parallelized by taking advantage of Nose and the Jenkins 
> multi-core machines.





[jira] [Resolved] (BEAM-1091) Python E2E WordCount test in Precommit

2017-03-29 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-1091.

   Resolution: Done
Fix Version/s: Not applicable

> Python E2E WordCount test in Precommit
> --
>
> Key: BEAM-1091
> URL: https://issues.apache.org/jira/browse/BEAM-1091
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py, testing
>Reporter: Mark Liu
>Assignee: Mark Liu
> Fix For: Not applicable
>
>
> We want to include some e2e tests in precommit in order to catch bugs earlier, 
> instead of breaking postcommit so often.
> As in postcommit, we want the same wordcount test in precommit, 
> executed by DirectPipelineRunner and DataflowRunner. 





[jira] [Closed] (BEAM-1687) Reduce Total Time of Running Python ValidatesRunner Tests in Postcommit

2017-03-29 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-1687.
--

> Reduce Total Time of Running Python ValidatesRunner Tests in Postcommit
> ---
>
> Key: BEAM-1687
> URL: https://issues.apache.org/jira/browse/BEAM-1687
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py, testing
>Reporter: Mark Liu
>Assignee: Mark Liu
> Fix For: Not applicable
>
>
> It takes 1h+ to run the 14 ValidatesRunner tests in the Python postcommit; 
> they can be parallelized by taking advantage of Nose and the Jenkins 
> multi-core machines.





Build failed in Jenkins: beam_PerformanceTests_Dataflow #246

2017-03-29 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Worker Image

[tgroh] Remove View indicator classes

[altay] Add first two mobile gaming examples to Python.

--
[...truncated 267.72 KB...]
 * [new ref] refs/pull/933/head -> origin/pr/933/head
 * [new ref] refs/pull/933/merge -> origin/pr/933/merge
 * [new ref] refs/pull/934/head -> origin/pr/934/head
 * [new ref] refs/pull/934/merge -> origin/pr/934/merge
 * [new ref] refs/pull/935/head -> origin/pr/935/head
 * [new ref] refs/pull/936/head -> origin/pr/936/head
 * [new ref] refs/pull/936/merge -> origin/pr/936/merge
 * [new ref] refs/pull/937/head -> origin/pr/937/head
 * [new ref] refs/pull/937/merge -> origin/pr/937/merge
 * [new ref] refs/pull/938/head -> origin/pr/938/head
 * [new ref] refs/pull/939/head -> origin/pr/939/head
 * [new ref] refs/pull/94/head -> origin/pr/94/head
 * [new ref] refs/pull/940/head -> origin/pr/940/head
 * [new ref] refs/pull/940/merge -> origin/pr/940/merge
 * [new ref] refs/pull/941/head -> origin/pr/941/head
 * [new ref] refs/pull/941/merge -> origin/pr/941/merge
 * [new ref] refs/pull/942/head -> origin/pr/942/head
 * [new ref] refs/pull/942/merge -> origin/pr/942/merge
 * [new ref] refs/pull/943/head -> origin/pr/943/head
 * [new ref] refs/pull/943/merge -> origin/pr/943/merge
 * [new ref] refs/pull/944/head -> origin/pr/944/head
 * [new ref] refs/pull/945/head -> origin/pr/945/head
 * [new ref] refs/pull/945/merge -> origin/pr/945/merge
 * [new ref] refs/pull/946/head -> origin/pr/946/head
 * [new ref] refs/pull/946/merge -> origin/pr/946/merge
 * [new ref] refs/pull/947/head -> origin/pr/947/head
 * [new ref] refs/pull/947/merge -> origin/pr/947/merge
 * [new ref] refs/pull/948/head -> origin/pr/948/head
 * [new ref] refs/pull/948/merge -> origin/pr/948/merge
 * [new ref] refs/pull/949/head -> origin/pr/949/head
 * [new ref] refs/pull/949/merge -> origin/pr/949/merge
 * [new ref] refs/pull/95/head -> origin/pr/95/head
 * [new ref] refs/pull/95/merge -> origin/pr/95/merge
 * [new ref] refs/pull/950/head -> origin/pr/950/head
 * [new ref] refs/pull/951/head -> origin/pr/951/head
 * [new ref] refs/pull/951/merge -> origin/pr/951/merge
 * [new ref] refs/pull/952/head -> origin/pr/952/head
 * [new ref] refs/pull/952/merge -> origin/pr/952/merge
 * [new ref] refs/pull/953/head -> origin/pr/953/head
 * [new ref] refs/pull/954/head -> origin/pr/954/head
 * [new ref] refs/pull/954/merge -> origin/pr/954/merge
 * [new ref] refs/pull/955/head -> origin/pr/955/head
 * [new ref] refs/pull/955/merge -> origin/pr/955/merge
 * [new ref] refs/pull/956/head -> origin/pr/956/head
 * [new ref] refs/pull/957/head -> origin/pr/957/head
 * [new ref] refs/pull/958/head -> origin/pr/958/head
 * [new ref] refs/pull/959/head -> origin/pr/959/head
 * [new ref] refs/pull/959/merge -> origin/pr/959/merge
 * [new ref] refs/pull/96/head -> origin/pr/96/head
 * [new ref] refs/pull/96/merge -> origin/pr/96/merge
 * [new ref] refs/pull/960/head -> origin/pr/960/head
 * [new ref] refs/pull/960/merge -> origin/pr/960/merge
 * [new ref] refs/pull/961/head -> origin/pr/961/head
 * [new ref] refs/pull/962/head -> origin/pr/962/head
 * [new ref] refs/pull/962/merge -> origin/pr/962/merge
 * [new ref] refs/pull/963/head -> origin/pr/963/head
 * [new ref] refs/pull/963/merge -> origin/pr/963/merge
 * [new ref] refs/pull/964/head -> origin/pr/964/head
 * [new ref] refs/pull/965/head -> origin/pr/965/head
 * [new ref] refs/pull/965/merge -> origin/pr/965/merge
 * [new ref] refs/pull/966/head -> origin/pr/966/head
 * [new ref] refs/pull/967/head -> origin/pr/967/head
 * [new ref] refs/pull/967/merge -> origin/pr/967/merge
 * [new ref] refs/pull/968/head -> origin/pr/968/head
 * [new ref] refs/pull/968/merge -> origin/pr/968/merge
 * [new ref] refs/pull/969/head -> origin/pr/969/head
 * [new ref] refs/pull/969/merge -> origin/pr/969/merge
 * [new ref] refs/pull/97/head -> origin/pr/97/head
 * [new ref] refs/pull/97/merge -> origin/pr/97/merge
 * [new ref] refs/pull/970/head -> origin/pr/970/head
 * [new ref] refs/pull/970/merge -> origin/pr/970/merge
 * [new ref] refs/pull/971/head -> origin/pr/971/head
 * [new ref] refs/pull/971/merge -> origin/pr/971/merge
 * [new ref] refs/pull/972/head -> origin/pr/972/head
 * [new ref] refs/pull/973/head -> 

Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #1434

2017-03-29 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2352: Add first two mobile gaming examples to Python

2017-03-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2352




[1/2] beam git commit: Add first two mobile gaming examples to Python.

2017-03-29 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 5d460d2e9 -> 4d633bc5a


Add first two mobile gaming examples to Python.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d43391da
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d43391da
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d43391da

Branch: refs/heads/master
Commit: d43391da7ed778baee0b03fdd5e1171c6516d4f5
Parents: 5d460d2
Author: Ahmet Altay 
Authored: Tue Mar 28 17:45:11 2017 -0700
Committer: Ahmet Altay 
Committed: Wed Mar 29 16:44:51 2017 -0700

--
 .../examples/complete/game/README.md|  69 +
 .../examples/complete/game/__init__.py  |  16 +
 .../examples/complete/game/hourly_team_score.py | 294 +++
 .../complete/game/hourly_team_score_test.py |  52 
 .../examples/complete/game/user_score.py| 219 ++
 .../examples/complete/game/user_score_test.py   |  49 
 6 files changed, 699 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d43391da/sdks/python/apache_beam/examples/complete/game/README.md
--
diff --git a/sdks/python/apache_beam/examples/complete/game/README.md 
b/sdks/python/apache_beam/examples/complete/game/README.md
new file mode 100644
index 000..39677e4
--- /dev/null
+++ b/sdks/python/apache_beam/examples/complete/game/README.md
@@ -0,0 +1,69 @@
+
+# 'Gaming' examples
+
+This directory holds a series of example Dataflow pipelines in a simple 'mobile
+gaming' domain. Each pipeline successively introduces new concepts.
+
+In the gaming scenario, many users play, as members of different teams, over
+the course of a day, and their actions are logged for processing. Some of the
+logged game events may be late-arriving, if users play on mobile devices and go
+transiently offline for a period.
+
+The scenario includes not only "regular" users, but "robot users", which have a
+higher click rate than the regular users, and may move from team to team.
+
+The first two pipelines in the series use pre-generated batch data samples.
+
+All of these pipelines write their results to Google BigQuery table(s).
+
+## The pipelines in the 'gaming' series
+
+### user_score
+
+The first pipeline in the series is `user_score`. This pipeline does batch
+processing of data collected from gaming events. It calculates the sum of
+scores per user, over an entire batch of gaming data (collected, say, for each
+day). The batch processing will not include any late data that arrives after
+the day's cutoff point.
+
+### hourly_team_score
+
+The next pipeline in the series is `hourly_team_score`. This pipeline also
+processes data collected from gaming events in batch. It builds on 
`user_score`,
+but uses [fixed 
windows](https://beam.apache.org/documentation/programming-guide/#windowing),
+by default an hour in duration. It calculates the sum of scores per team, for
+each window, optionally allowing specification of two timestamps before and
+after which data is filtered out. This allows a model where late data collected
+after the intended analysis window can be included in the analysis, and any
+late-arriving data prior to the beginning of the analysis window can be removed
+as well.
+
+By using windowing and adding element timestamps, we can do finer-grained
+analysis than with the `UserScore` pipeline — we're now tracking scores for
+each hour rather than over the course of a whole day. However, our batch
+processing is high-latency, in that we don't get results from plays at the
+beginning of the batch's time period until the complete batch is processed.
+
+## Viewing the results in BigQuery
+
+All of the pipelines write their results to BigQuery. `user_score` and
+`hourly_team_score` each write one table. The pipelines have default table 
names
+that you can override when you start up the pipeline if those tables already
+exist.
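The fixed-window aggregation that `hourly_team_score` describes can be sketched in plain Python. This is a conceptual illustration only, not the Beam pipeline; the event tuples, timestamps, and helper names here are invented:

```python
# Conceptual sketch of hourly_team_score's core logic: assign each scored
# event to a fixed one-hour window and sum scores per (team, window),
# optionally filtering events outside an analysis range.
from collections import defaultdict

HOUR = 60 * 60  # window size in seconds


def fixed_window_start(timestamp, size=HOUR):
    """Return the start of the fixed window containing `timestamp`."""
    return timestamp - (timestamp % size)


def hourly_team_scores(events, start=None, stop=None):
    """Sum scores per (team, window start) over (team, score, ts) tuples,
    dropping events outside the optional [start, stop) analysis range."""
    totals = defaultdict(int)
    for team, score, ts in events:
        if start is not None and ts < start:
            continue  # drop late data from before the analysis window
        if stop is not None and ts >= stop:
            continue
        totals[(team, fixed_window_start(ts))] += score
    return dict(totals)


events = [
    ("red", 5, 3600), ("red", 3, 3700),   # both fall in the 01:00 window
    ("blue", 4, 7300),                    # falls in the 02:00 window
]
print(hourly_team_scores(events))
# {('red', 3600): 8, ('blue', 7200): 4}
```

In the real pipeline the windowing is done by Beam's fixed windows and the summing by a grouped combine, but the grouping key is the same: team plus window.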

http://git-wip-us.apache.org/repos/asf/beam/blob/d43391da/sdks/python/apache_beam/examples/complete/game/__init__.py
--
diff --git a/sdks/python/apache_beam/examples/complete/game/__init__.py 
b/sdks/python/apache_beam/examples/complete/game/__init__.py
new file mode 100644
index 000..cce3aca
--- /dev/null
+++ b/sdks/python/apache_beam/examples/complete/game/__init__.py
@@ -0,0 +1,16 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# 

[2/2] beam git commit: This closes #2352

2017-03-29 Thread altay
This closes #2352


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4d633bc5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4d633bc5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4d633bc5

Branch: refs/heads/master
Commit: 4d633bc5a2679d740efcdf3bbe3350ebcfbbd587
Parents: 5d460d2 d43391d
Author: Ahmet Altay 
Authored: Wed Mar 29 16:44:54 2017 -0700
Committer: Ahmet Altay 
Committed: Wed Mar 29 16:44:54 2017 -0700

--
 .../examples/complete/game/README.md|  69 +
 .../examples/complete/game/__init__.py  |  16 +
 .../examples/complete/game/hourly_team_score.py | 294 +++
 .../complete/game/hourly_team_score_test.py |  52 
 .../examples/complete/game/user_score.py| 219 ++
 .../examples/complete/game/user_score_test.py   |  49 
 6 files changed, 699 insertions(+)
--




[jira] [Commented] (BEAM-1835) NPE in DirectRunner PubsubReader.ackBatch

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948096#comment-15948096
 ] 

ASF GitHub Bot commented on BEAM-1835:
--

GitHub user ravwojdyla opened a pull request:

https://github.com/apache/beam/pull/2365

[BEAM-1835] NPE pubsub

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravwojdyla/incubator-beam npe_pubsub

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2365.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2365


commit d3e3aeb117ec1fb8ed0c8b834d32da659c83d099
Author: Rafal Wojdyla 
Date:   2017-03-27T18:40:10Z

Fix typo

commit 1bab5cce57b56727495355c4d9a2ab1e1f122fbf
Author: Rafal Wojdyla 
Date:   2017-03-29T23:11:19Z

Never reuse reader for direct pipeline runner




> NPE in DirectRunner PubsubReader.ackBatch
> -
>
> Key: BEAM-1835
> URL: https://issues.apache.org/jira/browse/BEAM-1835
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct, sdk-java-core
>Reporter: Rafal Wojdyla
>Assignee: Rafal Wojdyla
>
> DirectRunner streaming mode throws null pointer exception:
> {noformat}
> org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:639)
>   at 
> org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:312)
>   at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.finishRead(UnboundedReadEvaluatorFactory.java:223)
>   at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:144)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This does not always happen, but for a large enough number of events it's 
> pretty reproducible. The problem seems to be the concurrent reuse of a 
> reader among multiple threads, and a race condition when one of the threads 
> decides to close the reader, based on:
> {code}
> private static final double DEFAULT_READER_REUSE_CHANCE = 0.95;
> {code}
> The close nulls out the Pub/Sub client:
> {code}
> @Override
> public void close() throws IOException {
>   if (pubsubClient != null) {
> pubsubClient.close();
> pubsubClient = null;
>   }
> }
> {code}
> which, if the reader is still in use by another thread, results in the NPE above.





[GitHub] beam pull request #2365: [BEAM-1835] NPE pubsub

2017-03-29 Thread ravwojdyla
GitHub user ravwojdyla opened a pull request:

https://github.com/apache/beam/pull/2365

[BEAM-1835] NPE pubsub

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravwojdyla/incubator-beam npe_pubsub

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2365.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2365


commit d3e3aeb117ec1fb8ed0c8b834d32da659c83d099
Author: Rafal Wojdyla 
Date:   2017-03-27T18:40:10Z

Fix typo

commit 1bab5cce57b56727495355c4d9a2ab1e1f122fbf
Author: Rafal Wojdyla 
Date:   2017-03-29T23:11:19Z

Never reuse reader for direct pipeline runner






[jira] [Updated] (BEAM-1835) NPE in DirectRunner PubsubReader.ackBatch

2017-03-29 Thread Rafal Wojdyla (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rafal Wojdyla updated BEAM-1835:

Summary: NPE in DirectRunner PubsubReader.ackBatch  (was: NPE in 
DirectRunner )

> NPE in DirectRunner PubsubReader.ackBatch
> -
>
> Key: BEAM-1835
> URL: https://issues.apache.org/jira/browse/BEAM-1835
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct, sdk-java-core
>Reporter: Rafal Wojdyla
>Assignee: Rafal Wojdyla
>
> DirectRunner streaming mode throws null pointer exception:
> {noformat}
> org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:639)
>   at 
> org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:312)
>   at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.finishRead(UnboundedReadEvaluatorFactory.java:223)
>   at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:144)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
>   at 
> org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This does not always happen, but for a large enough number of events it's 
> pretty reproducible. The problem seems to be the concurrent reuse of a 
> reader among multiple threads, and a race condition when one of the threads 
> decides to close the reader, based on:
> {code}
> private static final double DEFAULT_READER_REUSE_CHANCE = 0.95;
> {code}
> The close nulls out the Pub/Sub client:
> {code}
> @Override
> public void close() throws IOException {
>   if (pubsubClient != null) {
> pubsubClient.close();
> pubsubClient = null;
>   }
> }
> {code}
> which, if the reader is still in use by another thread, results in the NPE above.





[jira] [Created] (BEAM-1835) NPE in DirectRunner

2017-03-29 Thread Rafal Wojdyla (JIRA)
Rafal Wojdyla created BEAM-1835:
---

 Summary: NPE in DirectRunner 
 Key: BEAM-1835
 URL: https://issues.apache.org/jira/browse/BEAM-1835
 Project: Beam
  Issue Type: Bug
  Components: runner-direct, sdk-java-core
Reporter: Rafal Wojdyla
Assignee: Rafal Wojdyla


DirectRunner streaming mode throws null pointer exception:

{noformat}
org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:639)
  at 
org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:312)
  at 
org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.finishRead(UnboundedReadEvaluatorFactory.java:223)
  at 
org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:144)
  at 
org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
  at 
org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
{noformat}

This does not always happen, but for a large enough number of events it's pretty 
reproducible. The problem seems to be the concurrent reuse of a reader among 
multiple threads, and a race condition when one of the threads decides to 
close the reader, based on:

{code}
private static final double DEFAULT_READER_REUSE_CHANCE = 0.95;
{code}

The close nulls out the Pub/Sub client:

{code}
@Override
public void close() throws IOException {
  if (pubsubClient != null) {
pubsubClient.close();
pubsubClient = null;
  }
}
{code}

which, if the reader is still in use by another thread, results in the NPE above.
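The race can be sketched language-agnostically. Below is a hypothetical Python illustration, not Beam's code, showing one defensive pattern: reference-count the shared client so close() only tears it down once the last in-flight user has finished. (The fix attached to this issue takes a different route, by never reusing the reader in the direct runner.) The class and method names are invented for the sketch:

```python
# Hypothetical sketch: a shared client whose close() is deferred until
# no thread still holds it, avoiding the "nulled out while in use" NPE.
import threading


class RefCountedClient:
    def __init__(self):
        self._lock = threading.Lock()
        self._users = 0
        self._closed = False
        self.client = object()  # stands in for the real pubsub client

    def acquire(self):
        """Register a user of the client before starting work."""
        with self._lock:
            if self._closed:
                raise RuntimeError("client already closed")
            self._users += 1
            return self.client

    def release(self):
        """Unregister a user; tear down if close() was already requested."""
        with self._lock:
            self._users -= 1
            if self._closed and self._users == 0:
                self.client = None  # safe: no thread still holds it

    def close(self):
        """Request close; defer teardown while users are in flight."""
        with self._lock:
            self._closed = True
            if self._users == 0:
                self.client = None  # nobody in flight, tear down now


shared = RefCountedClient()
c = shared.acquire()              # a reader thread starts an ack batch
shared.close()                    # another thread "decides" to close
assert shared.client is not None  # still alive: a user is in flight
shared.release()                  # ack batch finishes
assert shared.client is None      # now safely torn down
```

With the original pattern, the close between acquire and release would null the client out from under the first thread; here the teardown is simply deferred.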





Build failed in Jenkins: beam_PostCommit_Python_Verify #1667

2017-03-29 Thread Apache Jenkins Server
See 


--
[...truncated 218.30 KB...]
 x [deleted] (none) -> origin/pr/942/head
 x [deleted] (none) -> origin/pr/942/merge
 x [deleted] (none) -> origin/pr/943/head
 x [deleted] (none) -> origin/pr/943/merge
 x [deleted] (none) -> origin/pr/944/head
 x [deleted] (none) -> origin/pr/945/head
 x [deleted] (none) -> origin/pr/945/merge
 x [deleted] (none) -> origin/pr/946/head
 x [deleted] (none) -> origin/pr/946/merge
 x [deleted] (none) -> origin/pr/947/head
 x [deleted] (none) -> origin/pr/947/merge
 x [deleted] (none) -> origin/pr/948/head
 x [deleted] (none) -> origin/pr/948/merge
 x [deleted] (none) -> origin/pr/949/head
 x [deleted] (none) -> origin/pr/949/merge
 x [deleted] (none) -> origin/pr/95/head
 x [deleted] (none) -> origin/pr/95/merge
 x [deleted] (none) -> origin/pr/950/head
 x [deleted] (none) -> origin/pr/951/head
 x [deleted] (none) -> origin/pr/951/merge
 x [deleted] (none) -> origin/pr/952/head
 x [deleted] (none) -> origin/pr/952/merge
 x [deleted] (none) -> origin/pr/953/head
 x [deleted] (none) -> origin/pr/954/head
 x [deleted] (none) -> origin/pr/954/merge
 x [deleted] (none) -> origin/pr/955/head
 x [deleted] (none) -> origin/pr/955/merge
 x [deleted] (none) -> origin/pr/956/head
 x [deleted] (none) -> origin/pr/957/head
 x [deleted] (none) -> origin/pr/958/head
 x [deleted] (none) -> origin/pr/959/head
 x [deleted] (none) -> origin/pr/959/merge
 x [deleted] (none) -> origin/pr/96/head
 x [deleted] (none) -> origin/pr/96/merge
 x [deleted] (none) -> origin/pr/960/head
 x [deleted] (none) -> origin/pr/960/merge
 x [deleted] (none) -> origin/pr/961/head
 x [deleted] (none) -> origin/pr/962/head
 x [deleted] (none) -> origin/pr/962/merge
 x [deleted] (none) -> origin/pr/963/head
 x [deleted] (none) -> origin/pr/963/merge
 x [deleted] (none) -> origin/pr/964/head
 x [deleted] (none) -> origin/pr/965/head
 x [deleted] (none) -> origin/pr/965/merge
 x [deleted] (none) -> origin/pr/966/head
 x [deleted] (none) -> origin/pr/967/head
 x [deleted] (none) -> origin/pr/967/merge
 x [deleted] (none) -> origin/pr/968/head
 x [deleted] (none) -> origin/pr/968/merge
 x [deleted] (none) -> origin/pr/969/head
 x [deleted] (none) -> origin/pr/969/merge
 x [deleted] (none) -> origin/pr/97/head
 x [deleted] (none) -> origin/pr/97/merge
 x [deleted] (none) -> origin/pr/970/head
 x [deleted] (none) -> origin/pr/970/merge
 x [deleted] (none) -> origin/pr/971/head
 x [deleted] (none) -> origin/pr/971/merge
 x [deleted] (none) -> origin/pr/972/head
 x [deleted] (none) -> origin/pr/973/head
 x [deleted] (none) -> origin/pr/974/head
 x [deleted] (none) -> origin/pr/974/merge
 x [deleted] (none) -> origin/pr/975/head
 x [deleted] (none) -> origin/pr/975/merge
 x [deleted] (none) -> origin/pr/976/head
 x [deleted] (none) -> origin/pr/976/merge
 x [deleted] (none) -> origin/pr/977/head
 x [deleted] (none) -> origin/pr/977/merge
 x [deleted] (none) -> origin/pr/978/head
 x [deleted] (none) -> origin/pr/978/merge
 x [deleted] (none) -> origin/pr/979/head
 x [deleted] (none) -> origin/pr/979/merge
 x [deleted] (none) -> origin/pr/98/head
 x [deleted] (none) -> origin/pr/980/head
 x [deleted] (none) -> origin/pr/980/merge
 x [deleted] (none) -> origin/pr/981/head
 x [deleted] (none) -> origin/pr/982/head
 x [deleted] (none) -> origin/pr/982/merge
 x [deleted] (none) -> origin/pr/983/head
 x [deleted] (none) -> origin/pr/983/merge
 x [deleted] (none) -> origin/pr/984/head
 x [deleted] (none) -> origin/pr/984/merge
 x [deleted] (none) -> origin/pr/985/head
 x [deleted] (none) -> origin/pr/985/merge
 x [deleted] (none) -> origin/pr/986/head
 x [deleted] (none) -> origin/pr/986/merge
 x [deleted] (none) -> origin/pr/987/head
 x [deleted] (none) -> origin/pr/988/head
 x [deleted] (none) -> origin/pr/988/merge
 x [deleted] (none) -> origin/pr/989/head
 

[GitHub] beam pull request #2360: Remove View indicator classes

2017-03-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2360




[2/2] beam git commit: This closes #2360

2017-03-29 Thread tgroh
This closes #2360


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5d460d2e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5d460d2e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5d460d2e

Branch: refs/heads/master
Commit: 5d460d2e986fc9b549260bafcb2e26249f2ed3bb
Parents: 8fa1159 6933465
Author: Thomas Groh 
Authored: Wed Mar 29 16:25:54 2017 -0700
Committer: Thomas Groh 
Committed: Wed Mar 29 16:25:54 2017 -0700

--
 .../apache/beam/sdk/util/PCollectionViews.java  | 148 +++
 1 file changed, 18 insertions(+), 130 deletions(-)
--




[1/2] beam git commit: Remove View indicator classes

2017-03-29 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 8fa115955 -> 5d460d2e9


Remove View indicator classes

With the latest Dataflow Worker, these indicator classes have been
replaced everywhere by referencing the available ViewFns.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6933465d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6933465d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6933465d

Branch: refs/heads/master
Commit: 6933465d659d6addeb30bbd5042c7b96834e418c
Parents: 8fa1159
Author: Thomas Groh 
Authored: Wed Mar 29 09:53:38 2017 -0700
Committer: Thomas Groh 
Committed: Wed Mar 29 13:16:31 2017 -0700

--
 .../apache/beam/sdk/util/PCollectionViews.java  | 148 +++
 1 file changed, 18 insertions(+), 130 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/6933465d/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
index 848d8b3..0794703 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
@@ -65,14 +65,11 @@ public class PCollectionViews {
   boolean hasDefault,
   @Nullable T defaultValue,
   Coder valueCoder) {
-// TODO: as soon as runners are ported off the indicator classes,
-// return new SimplePCollectionView<>(
-//pipeline,
-//new SingletonViewFn(hasDefault, defaultValue, valueCoder),
-//windowingStrategy,
-//valueCoder);
-return new SingletonPCollectionView<>(
-pipeline, windowingStrategy, hasDefault, defaultValue, valueCoder);
+ return new SimplePCollectionView<>(
+pipeline,
+new SingletonViewFn<>(hasDefault, defaultValue, valueCoder),
+windowingStrategy,
+valueCoder);
   }
 
   /**
@@ -83,10 +80,8 @@ public class PCollectionViews {
   Pipeline pipeline,
   WindowingStrategy windowingStrategy,
   Coder valueCoder) {
-// TODO: as soon as runners are ported off the indicator classes,
-// return new SimplePCollectionView<>(
-//pipeline, new IterableViewFn(), windowingStrategy, valueCoder);
-return new IterablePCollectionView<>(pipeline, windowingStrategy, 
valueCoder);
+return new SimplePCollectionView<>(
+pipeline, new IterableViewFn(), windowingStrategy, valueCoder);
   }
 
   /**
@@ -97,10 +92,8 @@ public class PCollectionViews {
   Pipeline pipeline,
   WindowingStrategy windowingStrategy,
   Coder valueCoder) {
-// TODO: as soon as runners are ported off the indicator classes,
-// return new SimplePCollectionView<>(
-//pipeline, new ListViewFn(), windowingStrategy, valueCoder);
-return new ListPCollectionView<>(pipeline, windowingStrategy, valueCoder);
+ return new SimplePCollectionView<>(
+pipeline, new ListViewFn(), windowingStrategy, valueCoder);
   }
 
   /**
@@ -108,13 +101,9 @@ public class PCollectionViews {
* provided {@link Coder} and windowed using the provided {@link WindowingStrategy}.
*/
  public static  PCollectionView> mapView(
-  Pipeline pipeline,
-  WindowingStrategy windowingStrategy,
-  Coder> valueCoder) {
-// TODO: as soon as runners are ported off the indicator classes,
-// return new SimplePCollectionView<>(
-//pipeline, new MapViewFn(), windowingStrategy, valueCoder);
-return new MapPCollectionView<>(pipeline, windowingStrategy, valueCoder);
+  Pipeline pipeline, WindowingStrategy windowingStrategy, Coder> valueCoder) {
+return new SimplePCollectionView<>(
+pipeline, new MapViewFn(), windowingStrategy, valueCoder);
   }
 
   /**
@@ -125,105 +114,10 @@ public class PCollectionViews {
   Pipeline pipeline,
   WindowingStrategy windowingStrategy,
   Coder> valueCoder) {
-// TODO: as soon as runners are ported off the indicator classes,
-// return new SimplePCollectionView<>(
-//pipeline, new MultimapViewFn(), windowingStrategy, valueCoder);
-return new MultimapPCollectionView<>(pipeline, windowingStrategy, valueCoder);
-  }
-
-  /**
-   * A public indicator class that this view is a singleton view.
-   *
-   * @deprecated Runners should not inspect the {@link PCollectionView} subclass, as it is an
-   * implementation detail. To specialize a side input, a runner should inspect the
-   * language-independent metadata of the {@link 
[jira] [Commented] (BEAM-1834) Bigquery Write validation doesn't work well with ValueInSingleWindow

2017-03-29 Thread Kevin Peterson (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948060#comment-15948060
 ] 

Kevin Peterson commented on BEAM-1834:
--

Gotcha. Yes, I'd like to request the ability to set data-dependent schemas.

Or just turn off the 
[validation|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L952]
 that prevents me from using CREATE_NEVER - I generate all of the tables before 
starting the pipeline, so I never actually need to create them.  
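For illustration, the per-table schema idea could look roughly like the sketch below. This is plain Java, with `java.util.function.Function` standing in for Beam's `SerializableFunction`, and the table specs and schema strings are entirely made up:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SchemaPerTable {
    // Hypothetical: map each destination table spec to its own schema string.
    // A real API would presumably map a TableDestination to a TableSchema.
    static Function<String, String> schemaFunction() {
        Map<String, String> schemas = new HashMap<>();
        schemas.put("project:dataset.clicks", "user:STRING,ts:TIMESTAMP");
        schemas.put("project:dataset.errors", "code:INTEGER,msg:STRING");
        return schemas::get;
    }

    public static void main(String[] args) {
        // The schema becomes a function of the destination, not one fixed value.
        System.out.println(schemaFunction().apply("project:dataset.clicks"));
    }
}
```

The point is only the shape of option 2: the schema is derived per destination rather than supplied once via {{withSchema}}.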

> Bigquery Write validation doesn't work well with ValueInSingleWindow
> 
>
> Key: BEAM-1834
> URL: https://issues.apache.org/jira/browse/BEAM-1834
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>
> I am using the new {{Write to(SerializableFunction String> tableSpecFunction)}} function to write data to different Bigquery 
> tables depending on the values. In my case, the values can have a different 
> schema (it starts as an {{Any}} encoded protobuf, which I parse and expand to 
> a {{TableRow}} object).
> Since the tables have different schemas, the existing implementation of 
> {{withSchema}} doesn't work.
> Some options:
> # Allow {{CreateDisposition.CREATE_NEVER}} in this situation. Failed inserts 
> from a missing table just fail (and eventually pass through via BEAM-190).
> # Add a new {{withSchema(SerializableFunction TableSchema>}} function.
> I think eventually both of the above should be allowable configurations, but 
> just one will unblock my current error. Happy to implement, given some 
> guidance on design preferences.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1834) Bigquery Write validation doesn't work well with ValueInSingleWindow

2017-03-29 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948040#comment-15948040
 ] 

Daniel Halperin commented on BEAM-1834:
---

Yeah, I meant "former". Sorry. We have tables, not schemas.

> Bigquery Write validation doesn't work well with ValueInSingleWindow
> 
>
> Key: BEAM-1834
> URL: https://issues.apache.org/jira/browse/BEAM-1834
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>
> I am using the new {{Write to(SerializableFunction String> tableSpecFunction)}} function to write data to different Bigquery 
> tables depending on the values. In my case, the values can have a different 
> schema (it starts as an {{Any}} encoded protobuf, which I parse and expand to 
> a {{TableRow}} object).
> Since the tables have different schemas, the existing implementation of 
> {{withSchema}} doesn't work.
> Some options:
> # Allow {{CreateDisposition.CREATE_NEVER}} in this situation. Failed inserts 
> from a missing table just fail (and eventually pass through via BEAM-190).
> # Add a new {{withSchema(SerializableFunction TableSchema>}} function.
> I think eventually both of the above should be allowable configurations, but 
> just one will unblock my current error. Happy to implement, given some 
> guidance on design preferences.





[jira] [Comment Edited] (BEAM-1834) Bigquery Write validation doesn't work well with ValueInSingleWindow

2017-03-29 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948014#comment-15948014
 ] 

Daniel Halperin edited comment on BEAM-1834 at 3/29/17 10:52 PM:
-

It sounds like this is a feature request for BigQueryIO supporting Data-dependent 
Schemas in addition to Data-dependent tables?

The former is explicitly not supported at this time.


was (Author: dhalp...@google.com):
It sounds like this is a feature request for BigQueryIO supporting Data-dependent 
Schemas in addition to Data-dependent tables?

The latter is explicitly not supported at this time.

> Bigquery Write validation doesn't work well with ValueInSingleWindow
> 
>
> Key: BEAM-1834
> URL: https://issues.apache.org/jira/browse/BEAM-1834
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>
> I am using the new {{Write to(SerializableFunction String> tableSpecFunction)}} function to write data to different Bigquery 
> tables depending on the values. In my case, the values can have a different 
> schema (it starts as an {{Any}} encoded protobuf, which I parse and expand to 
> a {{TableRow}} object).
> Since the tables have different schemas, the existing implementation of 
> {{withSchema}} doesn't work.
> Some options:
> # Allow {{CreateDisposition.CREATE_NEVER}} in this situation. Failed inserts 
> from a missing table just fail (and eventually pass through via BEAM-190).
> # Add a new {{withSchema(SerializableFunction TableSchema>}} function.
> I think eventually both of the above should be allowable configurations, but 
> just one will unblock my current error. Happy to implement, given some 
> guidance on design preferences.





[jira] [Resolved] (BEAM-1648) Replace gsutil calls with Cloud Storage API

2017-03-29 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-1648.
---
   Resolution: Fixed
Fix Version/s: First stable release

[~dlebech] (Feel free to close issues once the PR is merged.)

> Replace gsutil calls with Cloud Storage API
> ---
>
> Key: BEAM-1648
> URL: https://issues.apache.org/jira/browse/BEAM-1648
> Project: Beam
>  Issue Type: Wish
>  Components: sdk-py
>Affects Versions: 0.5.0
>Reporter: David Volquartz Lebech
>Assignee: Ahmet Altay
> Fix For: First stable release
>
>
> When using the {{DataflowRunner}} and {{--setup-file}} parameter, {{gsutil}} 
> is used for _some_ of the Cloud Storage uploads 
> [here|https://github.com/apache/beam/blob/466599d765aa82acaf997ec8776405152bbde4c1/sdks/python/apache_beam/runners/dataflow/internal/dependency.py#L89-L90].
>  This makes it difficult to run a pipeline in an environment where the Cloud 
> Platform tools are not installed -- e.g. a Docker Python container or a 
> Heroku instance.
> The Storage API is used in other places such as 
> [here|https://github.com/apache/beam/blob/466599d765aa82acaf997ec8776405152bbde4c1/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py#L431-L432]
>  when staging the session and main SDK package, so I'm unsure if this is by 
> design or an inconsistency in the usage.
> Thank you for considering this.





[jira] [Commented] (BEAM-1834) Bigquery Write validation doesn't work well with ValueInSingleWindow

2017-03-29 Thread Kevin Peterson (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948020#comment-15948020
 ] 

Kevin Peterson commented on BEAM-1834:
--

> The latter is explicitly not supported at this time.

Support for data-dependent tables was recently added on master: 
https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L825
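The data-dependent tables feature boils down to supplying a function from each element to a table spec string. A minimal plain-Java sketch of that shape (again with `Function` standing in for `SerializableFunction`, and a made-up project/dataset):

```java
import java.util.function.Function;

class TableRouting {
    // Hypothetical: derive the destination table spec from a field of the
    // element (here simply an event-type string).
    static final Function<String, String> tableSpecFunction =
        eventType -> "my-project:logs." + eventType;

    public static void main(String[] args) {
        System.out.println(tableSpecFunction.apply("clicks"));
    }
}
```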

> Bigquery Write validation doesn't work well with ValueInSingleWindow
> 
>
> Key: BEAM-1834
> URL: https://issues.apache.org/jira/browse/BEAM-1834
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>
> I am using the new {{Write to(SerializableFunction String> tableSpecFunction)}} function to write data to different Bigquery 
> tables depending on the values. In my case, the values can have a different 
> schema (it starts as an {{Any}} encoded protobuf, which I parse and expand to 
> a {{TableRow}} object).
> Since the tables have different schemas, the existing implementation of 
> {{withSchema}} doesn't work.
> Some options:
> # Allow {{CreateDisposition.CREATE_NEVER}} in this situation. Failed inserts 
> from a missing table just fail (and eventually pass through via BEAM-190).
> # Add a new {{withSchema(SerializableFunction TableSchema>}} function.
> I think eventually both of the above should be allowable configurations, but 
> just one will unblock my current error. Happy to implement, given some 
> guidance on design preferences.





[jira] [Commented] (BEAM-1834) Bigquery Write validation doesn't work well with ValueInSingleWindow

2017-03-29 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948014#comment-15948014
 ] 

Daniel Halperin commented on BEAM-1834:
---

It sounds like this is a feature request for BigQueryIO supporting Data-dependent 
Schemas in addition to Data-dependent tables?

The latter is explicitly not supported at this time.

> Bigquery Write validation doesn't work well with ValueInSingleWindow
> 
>
> Key: BEAM-1834
> URL: https://issues.apache.org/jira/browse/BEAM-1834
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>Assignee: Daniel Halperin
>
> I am using the new {{Write to(SerializableFunction String> tableSpecFunction)}} function to write data to different Bigquery 
> tables depending on the values. In my case, the values can have a different 
> schema (it starts as an {{Any}} encoded protobuf, which I parse and expand to 
> a {{TableRow}} object).
> Since the tables have different schemas, the existing implementation of 
> {{withSchema}} doesn't work.
> Some options:
> # Allow {{CreateDisposition.CREATE_NEVER}} in this situation. Failed inserts 
> from a missing table just fail (and eventually pass through via BEAM-190).
> # Add a new {{withSchema(SerializableFunction TableSchema>}} function.
> I think eventually both of the above should be allowable configurations, but 
> just one will unblock my current error. Happy to implement, given some 
> guidance on design preferences.





[jira] [Assigned] (BEAM-1834) Bigquery Write validation doesn't work well with ValueInSingleWindow

2017-03-29 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reassigned BEAM-1834:
-

Assignee: (was: Daniel Halperin)

> Bigquery Write validation doesn't work well with ValueInSingleWindow
> 
>
> Key: BEAM-1834
> URL: https://issues.apache.org/jira/browse/BEAM-1834
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>
> I am using the new {{Write to(SerializableFunction String> tableSpecFunction)}} function to write data to different Bigquery 
> tables depending on the values. In my case, the values can have a different 
> schema (it starts as an {{Any}} encoded protobuf, which I parse and expand to 
> a {{TableRow}} object).
> Since the tables have different schemas, the existing implementation of 
> {{withSchema}} doesn't work.
> Some options:
> # Allow {{CreateDisposition.CREATE_NEVER}} in this situation. Failed inserts 
> from a missing table just fail (and eventually pass through via BEAM-190).
> # Add a new {{withSchema(SerializableFunction TableSchema>}} function.
> I think eventually both of the above should be allowable configurations, but 
> just one will unblock my current error. Happy to implement, given some 
> guidance on design preferences.





[jira] [Resolved] (BEAM-1750) Refactor BigQueryIO.java into multiple files

2017-03-29 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin resolved BEAM-1750.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Refactor BigQueryIO.java into multiple files
> 
>
> Key: BEAM-1750
> URL: https://issues.apache.org/jira/browse/BEAM-1750
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
> Fix For: Not applicable
>
>
> BigQueryIO.java has become almost unmanageably large. We should refactor this 
> into several files.





[jira] [Assigned] (BEAM-437) Data-dependent BigQueryIO in batch

2017-03-29 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reassigned BEAM-437:


Assignee: Reuven Lax

> Data-dependent BigQueryIO in batch
> --
>
> Key: BEAM-437
> URL: https://issues.apache.org/jira/browse/BEAM-437
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Reuven Lax
>Priority: Minor
>
> Blocked by [BEAM-92].
> Right now, we use BigQuery's streaming write API when using window-dependent 
> tables in BigQuery. We should
> 1. Support data-dependent tables as well.
> 2. Find a way to use the batch write API.
> 3. This requires careful design to be idempotent or, at least, as close to 
> idempotent as possible.





[jira] [Commented] (BEAM-1269) BigtableIO should make more efficient use of connections

2017-03-29 Thread Solomon Duskis (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15948013#comment-15948013
 ] 

Solomon Duskis commented on BEAM-1269:
--

BigtableIO should not set data channel pool counts for reads. This is the 
current code:

  // Set data channel count to one because there is only 1 scanner in this session
  BigtableOptions.Builder clonedBuilder = options.toBuilder()
      .setDataChannelCount(1);
  BigtableOptions optionsWithAgent =
      clonedBuilder.setUserAgent(getBeamSdkPartOfUserAgent()).build();

It should be more like:

  BigtableOptions optionsWithAgent = options
      .toBuilder()
      .setUserAgent(getBeamSdkPartOfUserAgent())
      .setUseCachedDataPool(true)
      .setDataHost(BigtableOptions.BIGTABLE_BATCH_DATA_HOST_DEFAULT)
      .build();


> BigtableIO should make more efficient use of connections
> 
>
> Key: BEAM-1269
> URL: https://issues.apache.org/jira/browse/BEAM-1269
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>  Labels: newbie, starter
>
> Right now, {{BigtableIO}} opens up a new Bigtable session for every DoFn, in 
> the {{@Setup}} function. However, sessions can support multiple connections, 
> so perhaps this code should be modified to open up a smaller session pool and 
> then allocate connections in {{@StartBundle}}.
> This would likely make more efficient use of resources, especially for highly 
> multithreaded workers.





[jira] [Resolved] (BEAM-1530) BigQueryIO should support value-dependent windows

2017-03-29 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin resolved BEAM-1530.
---
   Resolution: Fixed
Fix Version/s: First stable release

> BigQueryIO should support value-dependent windows
> -
>
> Key: BEAM-1530
> URL: https://issues.apache.org/jira/browse/BEAM-1530
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
> Fix For: First stable release
>
>






[jira] [Commented] (BEAM-1830) add 'withTopic()' api to KafkaIO Reader

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947985#comment-15947985
 ] 

ASF GitHub Bot commented on BEAM-1830:
--

GitHub user rangadi opened a pull request:

https://github.com/apache/beam/pull/2364

[BEAM-1830] KafkaIO : Add withTopic() api that takes single topic.

The overwhelming majority of KafkaIO uses consume just one topic. It would be 
nice to have `withTopic(topic)` rather than always requiring a list. 

In addition, I am fixing a small bug I noticed while using 
`KafkaIO.write().values()`: it removes the need to set a key coder for the 
Writer when writing values only. If we didn't specify the key coder, validation 
succeeded but a check failed while instantiating the Kafka producer.
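The single-topic convenience method can simply wrap the topic and delegate to the existing list-based setter. A hypothetical mini-builder (not the real `KafkaIO.Read`) sketching that delegation:

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for KafkaIO.Read, showing how withTopic(String)
// can delegate to the existing withTopics(List) method.
class MiniKafkaRead {
    private List<String> topics = Collections.emptyList();

    MiniKafkaRead withTopics(List<String> topics) {
        this.topics = topics;
        return this;
    }

    // Proposed convenience method: wrap the single topic in a list.
    MiniKafkaRead withTopic(String topic) {
        return withTopics(Collections.singletonList(topic));
    }

    List<String> getTopics() {
        return topics;
    }

    public static void main(String[] args) {
        System.out.println(new MiniKafkaRead().withTopic("events").getTopics());
    }
}
```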

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/incubator-beam with_topic

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2364.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2364


commit d958c7bdb14ea42f80b955a94fd9d4f6b1b870fc
Author: Raghu Angadi 
Date:   2017-03-29T15:17:25Z

KafkaIO : Add withTopic() api that takes single topic.

Remove need for setting key coder for Writer while writing
values only. If we didn't specify the key coder, validation
succeeded but it failed a check while instantiating Kafka producer.




> add 'withTopic()' api to KafkaIO Reader
> ---
>
> Key: BEAM-1830
> URL: https://issues.apache.org/jira/browse/BEAM-1830
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: 0.6.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Minor
> Fix For: First stable release
>
>
> Most instances of KafkaIO readers consume just one topic. The existing 
> method {{withTopics(List topics)}} forces users to make a list 
> containing a single topic.
> It would be simpler to add a {{withTopic(String topic)}} method.





[GitHub] beam pull request #2364: [BEAM-1830] KafkaIO : Add withTopic() api that take...

2017-03-29 Thread rangadi
GitHub user rangadi opened a pull request:

https://github.com/apache/beam/pull/2364

[BEAM-1830] KafkaIO : Add withTopic() api that takes single topic.

The overwhelming majority of KafkaIO uses consume just one topic. It would be 
nice to have `withTopic(topic)` rather than always requiring a list. 

In addition, I am fixing a small bug I noticed while using 
`KafkaIO.write().values()`: it removes the need to set a key coder for the 
Writer when writing values only. If we didn't specify the key coder, validation 
succeeded but a check failed while instantiating the Kafka producer.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/incubator-beam with_topic

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2364.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2364


commit d958c7bdb14ea42f80b955a94fd9d4f6b1b870fc
Author: Raghu Angadi 
Date:   2017-03-29T15:17:25Z

KafkaIO : Add withTopic() api that takes single topic.

Remove need for setting key coder for Writer while writing
values only. If we didn't specify the key coder, validation
succeeded but it failed a check while instantiating Kafka producer.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-1834) Bigquery Write validation doesn't work well with ValueInSingleWindow

2017-03-29 Thread Kevin Peterson (JIRA)
Kevin Peterson created BEAM-1834:


 Summary: Bigquery Write validation doesn't work well with 
ValueInSingleWindow
 Key: BEAM-1834
 URL: https://issues.apache.org/jira/browse/BEAM-1834
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-gcp
Reporter: Kevin Peterson
Assignee: Daniel Halperin


I am using the new {{Write to(SerializableFunction tableSpecFunction)}} function to write data to different Bigquery 
tables depending on the values. In my case, the values can have a different 
schema (it starts as an {{Any}} encoded protobuf, which I parse and expand to a 
{{TableRow}} object).

Since the tables have different schemas, the existing implementation of 
{{withSchema}} doesn't work.

Some options:
# Allow {{CreateDisposition.CREATE_NEVER}} in this situation. Failed inserts 
from a missing table just fail (and eventually pass through via BEAM-190).
# Add a new {{withSchema(SerializableFunction}} function.

I think eventually both of the above should be allowable configurations, but 
just one will unblock my current error. Happy to implement, given some guidance 
on design preferences.





[jira] [Assigned] (BEAM-1833) Restructure Python pipeline construction to better follow the Runner API

2017-03-29 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-1833:
-

Assignee: Sourabh Bajaj  (was: Ahmet Altay)

> Restructure Python pipeline construction to better follow the Runner API
> 
>
> Key: BEAM-1833
> URL: https://issues.apache.org/jira/browse/BEAM-1833
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Robert Bradshaw
>Assignee: Sourabh Bajaj
>
> The most important part is removing the runner.apply overrides, but there are 
> also various other improvements (e.g. all inputs and outputs should be named).





[jira] [Created] (BEAM-1833) Restructure Python pipeline construction to better follow the Runner API

2017-03-29 Thread Robert Bradshaw (JIRA)
Robert Bradshaw created BEAM-1833:
-

 Summary: Restructure Python pipeline construction to better follow 
the Runner API
 Key: BEAM-1833
 URL: https://issues.apache.org/jira/browse/BEAM-1833
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py
Reporter: Robert Bradshaw
Assignee: Ahmet Altay


The most important part is removing the runner.apply overrides, but there are 
also various other improvements (e.g. all inputs and outputs should be named).





[jira] [Resolved] (BEAM-1826) Allow BigqueryIO to forward errors

2017-03-29 Thread Kevin Peterson (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Peterson resolved BEAM-1826.
--
   Resolution: Duplicate
Fix Version/s: Not applicable

> Allow BigqueryIO to forward errors
> --
>
> Key: BEAM-1826
> URL: https://issues.apache.org/jira/browse/BEAM-1826
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>Assignee: Daniel Halperin
>Priority: Minor
> Fix For: Not applicable
>
>
> Most sinks are terminal - data ends at the sink. While on occasion the sink 
> may temporarily fail due to resource unavailability, it will eventually 
> succeed. However, some have strict requirements on their input format. In 
> these cases, retries will never succeed, and continuous retrying will 
> eventually lead to pipeline failure.
> The primary use case I have in mind is streaming data to a sink such as 
> BigQuery, where data of the wrong format could fail on insert.
> It would be useful to be able to set a side output or downstream transform 
> from Bigquery which can receive failed rows where retry will never fix the 
> issue, and allow them to be persisted to a different output which is more 
> permissive of the output, to prevent data loss.





[jira] [Commented] (BEAM-1826) Allow BigqueryIO to forward errors

2017-03-29 Thread Kevin Peterson (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947938#comment-15947938
 ] 

Kevin Peterson commented on BEAM-1826:
--

Yep. https://github.com/apache/beam/pull/1609 looks like it does exactly what I 
want.

> Allow BigqueryIO to forward errors
> --
>
> Key: BEAM-1826
> URL: https://issues.apache.org/jira/browse/BEAM-1826
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>Assignee: Daniel Halperin
>Priority: Minor
>
> Most sinks are terminal - data ends at the sink. While on occasion the sink 
> may temporarily fail due to resource unavailability, it will eventually 
> succeed. However, some have strict requirements on their input format. In 
> these cases, retries will never succeed, and continuous retrying will 
> eventually lead to pipeline failure.
> The primary use case I have in mind is streaming data to a sink such as 
> BigQuery, where data of the wrong format could fail on insert.
> It would be useful to be able to set a side output or downstream transform 
> from Bigquery which can receive failed rows where retry will never fix the 
> issue, and allow them to be persisted to a different output which is more 
> permissive of the output, to prevent data loss.





[jira] [Commented] (BEAM-1418) MapElements and FlatMapElements should comply with PTransform style guide

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947877#comment-15947877
 ] 

ASF GitHub Bot commented on BEAM-1418:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2363

[BEAM-1418] MapElements and FlatMapElements should comply with PTransform 
style guide

For both of these classes, we change 
(Flat)MapElements.via(fn).withOutputType(td) to 
(Flat)MapElements.into(td).via(fn), which is both shorter and allows getting 
rid of the ugly intermediate and publicly-visible class 
MissingOutputTypeDescriptor.
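To see why leading with the output type helps, here is a stripped-down sketch (hypothetical classes, not Beam's real `MapElements`/`TypeDescriptor`): `into(...)` fixes the output type parameter first, so `via(fn)` can return a fully-typed transform with no intermediate "missing output type" class left in the public API:

```java
import java.util.function.Function;

class MiniMapElements<InputT, OutputT> {
    private final Function<InputT, OutputT> fn;

    private MiniMapElements(Function<InputT, OutputT> fn) {
        this.fn = fn;
    }

    // Step 1: into(outputType) fixes OutputT up front. (In real Beam the
    // TypeDescriptor also drives coder inference; here it is illustrative.)
    static <OutputT> Builder<OutputT> into(Class<OutputT> outputType) {
        return new Builder<>();
    }

    static class Builder<OutputT> {
        // Step 2: via(fn) supplies the function and immediately yields a
        // fully-typed transform, so no awkward intermediate class is exposed.
        <InputT> MiniMapElements<InputT, OutputT> via(Function<InputT, OutputT> fn) {
            return new MiniMapElements<>(fn);
        }
    }

    OutputT apply(InputT input) {
        return fn.apply(input);
    }

    public static void main(String[] args) {
        MiniMapElements<String, Integer> lengths =
            MiniMapElements.into(Integer.class).via(String::length);
        System.out.println(lengths.apply("beam"));
    }
}
```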

R: @bjchambers 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam map-style

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2363.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2363


commit c52cddfaf15eb1d5df65bdcbdfb4ee6f167fe0f2
Author: Eugene Kirpichov 
Date:   2017-03-29T20:42:29Z

Removes MapElements.MissingOutputTypeDescriptor

This comes from changing MapElements.via(fn).withOutputType(td)
to MapElements.into(td).via(fn) which is also shorter.

commit 6ae80473f9924fe43c18e02f4a139f6421306c7f
Author: Eugene Kirpichov 
Date:   2017-03-29T20:56:00Z

Removes FlatMapElements.MissingOutputTypeDescriptor

This comes from changing FlatMapElements.via(fn).withOutputType(td)
to FlatMapElements.into(td).via(fn) which is also shorter.




> MapElements and FlatMapElements should comply with PTransform style guide
> -
>
> Key: BEAM-1418
> URL: https://issues.apache.org/jira/browse/BEAM-1418
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>  Labels: backward-incompatible, starter
> Fix For: First stable release
>
>
> Type structure of these classes can be improved by slightly changing the API:
> FlatMapElements.into(TypeDescriptor).via(as usual).
> Likewise for MapElements. This allows getting rid of the awkward 
> MissingOutputTypeDescriptor intermediate class.





[GitHub] beam pull request #2363: [BEAM-1418] MapElements and FlatMapElements should ...

2017-03-29 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2363

[BEAM-1418] MapElements and FlatMapElements should comply with PTransform 
style guide

For both of these classes, we change 
(Flat)MapElements.via(fn).withOutputType(td) to 
(Flat)MapElements.into(td).via(fn), which is both shorter and allows getting 
rid of the ugly intermediate and publicly-visible class 
MissingOutputTypeDescriptor.

R: @bjchambers 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam map-style

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2363.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2363


commit c52cddfaf15eb1d5df65bdcbdfb4ee6f167fe0f2
Author: Eugene Kirpichov 
Date:   2017-03-29T20:42:29Z

Removes MapElements.MissingOutputTypeDescriptor

This comes from changing MapElements.via(fn).withOutputType(td)
to MapElements.into(td).via(fn) which is also shorter.

commit 6ae80473f9924fe43c18e02f4a139f6421306c7f
Author: Eugene Kirpichov 
Date:   2017-03-29T20:56:00Z

Removes FlatMapElements.MissingOutputTypeDescriptor

This comes from changing FlatMapElements.via(fn).withOutputType(td)
to FlatMapElements.into(td).via(fn) which is also shorter.






Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #1433

2017-03-29 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2077: Improve documentation about Test Categories + Fix s...

2017-03-29 Thread iemejia
Github user iemejia closed the pull request at:

https://github.com/apache/beam/pull/2077




[jira] [Commented] (BEAM-1425) Window should comply with PTransform style guide

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947823#comment-15947823
 ] 

ASF GitHub Bot commented on BEAM-1425:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2362

[BEAM-1425] Window should comply with PTransform style guide

Incompatible changes:

- Window.Bound class is now simply Window: matters for users that were 
extracting the transform into a variable.
- Static methods such as Window.triggering(), Window.withAllowedLateness() 
etc. are now available via Window.configure() - e.g. 
Window.configure().withAllowedLateness(...). The method Window.into() 
is left intact as it is the primary entry point for windowing a collection, 
whereas the other methods are for adjusting the parameters of the windowing 
function.
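The entry-point split can be illustrated with a self-contained sketch (NOT the real Beam Window class; `WindowSketch` and the use of a String in place of a WindowFn are stand-ins): into() stays the one static entry point for choosing a windowing function, and every other setting now hangs off configure().

```java
// Sketch (not the real Beam Window class) of the configure()/into() split.
class WindowSketch<T> {
  final String windowFn;        // stand-in for a WindowFn
  final long allowedLatenessMs;

  private WindowSketch(String windowFn, long allowedLatenessMs) {
    this.windowFn = windowFn;
    this.allowedLatenessMs = allowedLatenessMs;
  }

  // Primary entry point: pick the windowing function.
  static <T> WindowSketch<T> into(String windowFn) {
    return new WindowSketch<>(windowFn, 0);
  }

  // Replacement for the removed static methods: start from a neutral config
  // that only adjusts parameters of an existing windowing strategy.
  static <T> WindowSketch<T> configure() {
    return new WindowSketch<>(null, 0);
  }

  WindowSketch<T> withAllowedLateness(long ms) {
    return new WindowSketch<>(windowFn, ms);
  }
}
```

Call sites change from the old static Window.withAllowedLateness(...) style to Window.configure().withAllowedLateness(...), while Window.into(...) call sites are untouched.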

R: @tgroh 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam window-style

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2362.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2362


commit 100b65f5b5203e664386192bc390c34a51a8005b
Author: Eugene Kirpichov 
Date:   2017-03-29T00:59:11Z

Removes unused name parameter

commit f981f40e9f7ec48663dfa692bf44ca680cf2528d
Author: Eugene Kirpichov 
Date:   2017-03-29T01:04:37Z

Fixes a few warnings in Window

commit 9a0fa6617d929ff626bb292232327f79597d241d
Author: Eugene Kirpichov 
Date:   2017-03-29T01:14:30Z

Uses AutoValue in Window

commit 9de5bb2fde66fc0738aec770fcfc1846fcd2be95
Author: Eugene Kirpichov 
Date:   2017-03-29T19:58:20Z

Replaced static Window.blah() methods with Window.configure().blah() except 
Window.into()

commit 0ec3afc5530df0e5d1f7bbba4093ef16be8e3657
Author: Eugene Kirpichov 
Date:   2017-03-29T20:09:49Z

Replaces Window.Bound with simply Window




> Window should comply with PTransform style guide
> 
>
> Key: BEAM-1425
> URL: https://issues.apache.org/jira/browse/BEAM-1425
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>  Labels: backward-incompatible, starter
> Fix For: First stable release
>
>
> Suggested changes:
> - Remove static builder-like methods such as triggering(), 
> discardingFiredPanes() - the only static entry point should be .into().
> - (optional) use AutoValue



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2362: [BEAM-1425] Window should comply with PTransform st...

2017-03-29 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2362

[BEAM-1425] Window should comply with PTransform style guide

Incompatible changes:

- Window.Bound class is now simply Window: matters for users that were 
extracting the transform into a variable.
- Static methods such as Window.triggering(), Window.withAllowedLateness() 
etc. are now available via Window.configure() - e.g. 
Window.configure().withAllowedLateness(...). The method Window.into() 
is left intact as it is the primary entry point for windowing a collection, 
whereas the other methods are for adjusting the parameters of the windowing 
function.

R: @tgroh 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam window-style

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2362.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2362


commit 100b65f5b5203e664386192bc390c34a51a8005b
Author: Eugene Kirpichov 
Date:   2017-03-29T00:59:11Z

Removes unused name parameter

commit f981f40e9f7ec48663dfa692bf44ca680cf2528d
Author: Eugene Kirpichov 
Date:   2017-03-29T01:04:37Z

Fixes a few warnings in Window

commit 9a0fa6617d929ff626bb292232327f79597d241d
Author: Eugene Kirpichov 
Date:   2017-03-29T01:14:30Z

Uses AutoValue in Window

commit 9de5bb2fde66fc0738aec770fcfc1846fcd2be95
Author: Eugene Kirpichov 
Date:   2017-03-29T19:58:20Z

Replaced static Window.blah() methods with Window.configure().blah() except 
Window.into()

commit 0ec3afc5530df0e5d1f7bbba4093ef16be8e3657
Author: Eugene Kirpichov 
Date:   2017-03-29T20:09:49Z

Replaces Window.Bound with simply Window






[GitHub] beam pull request #2361: Update Dataflow Worker Image

2017-03-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2361




[2/2] beam git commit: This closes #2361

2017-03-29 Thread tgroh
This closes #2361


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8fa11595
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8fa11595
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8fa11595

Branch: refs/heads/master
Commit: 8fa1159554630331a444943c102d17b59482a0eb
Parents: 8a33591 a3c87ef
Author: Thomas Groh 
Authored: Wed Mar 29 13:15:57 2017 -0700
Committer: Thomas Groh 
Committed: Wed Mar 29 13:15:57 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[1/2] beam git commit: Update Dataflow Worker Image

2017-03-29 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 8a33591d9 -> 8fa115955


Update Dataflow Worker Image


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a3c87ef6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a3c87ef6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a3c87ef6

Branch: refs/heads/master
Commit: a3c87ef6537e094c1568928bff7675c7dee49821
Parents: 8a33591
Author: Thomas Groh <tg...@google.com>
Authored: Wed Mar 29 12:26:14 2017 -0700
Committer: Thomas Groh <tg...@google.com>
Committed: Wed Mar 29 12:26:14 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a3c87ef6/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index b88574a..451edb9 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-    beam-master-20170328
+    beam-master-20170329
 
1
 
6
   



[GitHub] beam pull request #2361: Update Dataflow Worker Image

2017-03-29 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2361

Update Dataflow Worker Image

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam update_worker_image

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2361.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2361


commit a3c87ef6537e094c1568928bff7675c7dee49821
Author: Thomas Groh 
Date:   2017-03-29T19:26:14Z

Update Dataflow Worker Image






[jira] [Assigned] (BEAM-1589) Add OnWindowExpiration method to Stateful DoFn

2017-03-29 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-1589:
-

Assignee: Kenneth Knowles

> Add OnWindowExpiration method to Stateful DoFn
> --
>
> Key: BEAM-1589
> URL: https://issues.apache.org/jira/browse/BEAM-1589
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core, sdk-java-core
>Reporter: Jingsong Lee
>Assignee: Kenneth Knowles
>
> See BEAM-1517
> This allows the user to do some work before the state's garbage collection.
> It seems kind of annoying, but on the other hand forgetting to set a final 
> timer to flush state is probably data loss most of the time.
> FlinkRunner does this work very simply, but other runners, such as 
> DirectRunner, need to traverse all the states to do this, and maybe it's a 
> little hard.
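What such a hook buys can be shown with a toy model (this is NOT runner code; `ExpiryModel` and its method names are invented for illustration): when the watermark passes a window's end, the callback sees the buffered state one last time before it is garbage-collected, so nothing is silently dropped.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Toy model (not runner code) of an onWindowExpiration-style flush hook.
class ExpiryModel {
  private final Map<Long, List<String>> stateByWindowEnd = new TreeMap<>();
  private final BiConsumer<Long, List<String>> onWindowExpiration;

  ExpiryModel(BiConsumer<Long, List<String>> onWindowExpiration) {
    this.onWindowExpiration = onWindowExpiration;
  }

  void buffer(long windowEnd, String element) {
    stateByWindowEnd.computeIfAbsent(windowEnd, k -> new ArrayList<>()).add(element);
  }

  // Advancing the watermark expires windows; the hook fires before state GC.
  void advanceWatermark(long watermark) {
    Iterator<Map.Entry<Long, List<String>>> it = stateByWindowEnd.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<Long, List<String>> e = it.next();
      if (e.getKey() <= watermark) {
        onWindowExpiration.accept(e.getKey(), e.getValue()); // flush first
        it.remove();                                         // then GC state
      }
    }
  }
}
```

Without the hook, the `it.remove()` step would discard the buffered elements, which is exactly the "forgetting to set a final timer is probably data loss" failure mode described above.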





[jira] [Commented] (BEAM-778) Make fileio._CompressedFile seekable.

2017-03-29 Thread Tibor Kiss (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947739#comment-15947739
 ] 

Tibor Kiss commented on BEAM-778:
-

Thanks for the insights, [~katsia...@google.com] & [~chamikara].

As there is no immediate risk of concurrency issues, I'd also opt for not
extending _CompressedFile with a lock prematurely.
I'll make a comment about the (lack of) thread safety in _CompressedFile in the 
PR associated with this JIRA.

> Make fileio._CompressedFile seekable.
> -
>
> Key: BEAM-778
> URL: https://issues.apache.org/jira/browse/BEAM-778
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Tibor Kiss
> Fix For: Not applicable
>
>
> We have a TODO to make fileio._CompressedFile seekable.
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L692
> Without this, compressed file objects produced for FileBasedSource
> implementations may not be able to use libraries that utilize methods seek()
> and tell().
> For example tarfile.open().





[jira] [Commented] (BEAM-778) Make fileio._CompressedFile seekable.

2017-03-29 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947718#comment-15947718
 ] 

Chamikara Jayalath commented on BEAM-778:
-

Currently this is not an issue since Beam FileBasedSoure and FileBasedSink are 
the only users of CompressedFile/File objects and they are used in a pretty 
straightforward way where each FileBasedSource/FileBasedSink object owns its
File/CompressedFile object and reading is done using a single thread. A 
secondary thread that performs dynamic work rebalancing might execute seek() 
operations for File objects but not for CompressedFile objects.

In the future we might have other places where we access CompressedFile objects 
using multiple threads, but I think we should probably wait until such needs
arise. Also it might be enough to declare CompressedFile objects to be not 
thread safe and expect users to address thread safety instead of embedding a 
lock in CompressedFile objects which would potentially add a performance 
penalty for all users.

WDYT ? 
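The seekability mechanics under discussion can be sketched in a self-contained way (plain Java over gzip here, not the Python _CompressedFile implementation; `SeekableCompressed` and its API are illustrative only): a forward-only decompressing stream can emulate seek()/tell() by decompressing and discarding for forward seeks, and by reopening from the start for backward seeks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch (not the Python _CompressedFile) of seek()/tell() over a
// forward-only compressed stream.
class SeekableCompressed {
  private final byte[] compressed;
  private InputStream in;
  private long pos;

  SeekableCompressed(byte[] compressed) {
    this.compressed = compressed;
    reopen();
  }

  private void reopen() {
    try {
      in = new GZIPInputStream(new ByteArrayInputStream(compressed));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    pos = 0;
  }

  long tell() { return pos; }

  void seek(long target) {
    if (target < pos) reopen();            // backward seek: restart the stream
    try {
      while (pos < target && in.read() >= 0) pos++;  // forward: skip bytes
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  int read() {
    try {
      int b = in.read();
      if (b >= 0) pos++;
      return b;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Helper for demos: gzip-compress a string.
  static byte[] gzip(String text) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(text.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }
}
```

A single reader thread uses this safely; it is the backward-seek reopen path that a second thread could race on, which is why the thread-compatibility (rather than thread-safety) framing above seems sufficient for now.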

> Make fileio._CompressedFile seekable.
> -
>
> Key: BEAM-778
> URL: https://issues.apache.org/jira/browse/BEAM-778
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Tibor Kiss
> Fix For: Not applicable
>
>
> We have a TODO to make fileio._CompressedFile seekable.
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L692
> Without this, compressed file objects produced for FileBasedSource
> implementations may not be able to use libraries that utilize methods seek()
> and tell().
> For example tarfile.open().





[jira] [Commented] (BEAM-447) Stop referring to types with Bound/Unbound

2017-03-29 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947688#comment-15947688
 ] 

Eugene Kirpichov commented on BEAM-447:
---

Remaining instances: TextIO, AvroIO (handled by 
https://github.com/apache/beam/pull/1927), Window (handled by a 
soon-to-be-sent-for-review PR of mine), XmlSink, TFRecordIO.

> Stop referring to types with Bound/Unbound
> --
>
> Key: BEAM-447
> URL: https://issues.apache.org/jira/browse/BEAM-447
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Davor Bonaci
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> Bounded and Unbounded are used to refer to PCollections, and the overlap is 
> confusing. These classes should be renamed to be more specific (e.g.
> ParDo.LackingDoFnSingleOutput, ParDo.SingleOutput, Window.AssignWindows),
> which removes the overlap.
> examples:
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ParDo.java#L658
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ParDo.java#L868





[jira] [Assigned] (BEAM-463) BoundedHeapCoder should be a StandardCoder and not a CustomCoder

2017-03-29 Thread Aviem Zur (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aviem Zur reassigned BEAM-463:
--

Assignee: (was: Aviem Zur)

> BoundedHeapCoder should be a StandardCoder and not a CustomCoder
> 
>
> Key: BEAM-463
> URL: https://issues.apache.org/jira/browse/BEAM-463
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: backward-incompatible
>
> The issue is that BoundedHeapCoder does not report component encodings which 
> prevents effective runner inspection of the components.





[jira] [Commented] (BEAM-463) BoundedHeapCoder should be a StandardCoder and not a CustomCoder

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947666#comment-15947666
 ] 

ASF GitHub Bot commented on BEAM-463:
-

Github user aviemzur closed the pull request at:

https://github.com/apache/beam/pull/2108


> BoundedHeapCoder should be a StandardCoder and not a CustomCoder
> 
>
> Key: BEAM-463
> URL: https://issues.apache.org/jira/browse/BEAM-463
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Aviem Zur
>Priority: Minor
>  Labels: backward-incompatible
>
> The issue is that BoundedHeapCoder does not report component encodings which 
> prevents effective runner inspection of the components.





[GitHub] beam pull request #2108: [BEAM-463] [BEAM-464] [BEAM-466] BoundedHeapCoder, ...

2017-03-29 Thread aviemzur
Github user aviemzur closed the pull request at:

https://github.com/apache/beam/pull/2108




[jira] [Closed] (BEAM-707) Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use AutoValue to reduce boilerplate

2017-03-29 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-707.
-
   Resolution: Duplicate
Fix Version/s: Not applicable

> Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use AutoValue 
> to reduce boilerplate
> -
>
> Key: BEAM-707
> URL: https://issues.apache.org/jira/browse/BEAM-707
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: io, simple, starter
> Fix For: Not applicable
>
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054





[jira] [Closed] (BEAM-710) Migrate Read/Write to use AutoValue to reduce boilerplate

2017-03-29 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-710.
-
   Resolution: Won't Fix
Fix Version/s: Not applicable

Not worth it - Read and Write have only 1 and 2 fields respectively.

> Migrate Read/Write to use AutoValue to reduce boilerplate
> -
>
> Key: BEAM-710
> URL: https://issues.apache.org/jira/browse/BEAM-710
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: io, simple, starter
> Fix For: Not applicable
>
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054





[jira] [Closed] (BEAM-709) Migrate CountingSource/CountingInput to use AutoValue to reduce boilerplate

2017-03-29 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-709.
-
   Resolution: Duplicate
Fix Version/s: Not applicable

> Migrate CountingSource/CountingInput to use AutoValue to reduce boilerplate
> ---
>
> Key: BEAM-709
> URL: https://issues.apache.org/jira/browse/BEAM-709
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: io, simple, starter
> Fix For: Not applicable
>
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054





[jira] [Closed] (BEAM-73) IO design pattern: Decouple Parsers and Coders

2017-03-29 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-73.

Resolution: Duplicate

The only remaining instance of this is in KafkaIO, handled by BEAM-1573.

> IO design pattern: Decouple Parsers and Coders
> --
>
> Key: BEAM-73
> URL: https://issues.apache.org/jira/browse/BEAM-73
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Priority: Minor
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> Many Sources can be thought of as providing a byte[] payload -- e.g. TextIO 
> bytes between newlines, or PubSubIO messages. Therefore, we originally 
> suggested a Coder as the thing to use to decode these byte[] into T (what 
> I'll call Parsing).
> Consider the case of a text file of integers.
> 123\n
> 456\n
> ...
> We want a PCollection out, so we can use TextualIntegerCoder with 
> TextIO.Read. However, that Coder will get propagated as the default coder for 
> that PCollection (and may be used in downstream DoFns). This seem bad as, 
> once the data is parsed, we probably want to use VarIntCoder or another Coder 
> that is more CPU- and Space-efficient.
> Another design pattern is
> TextIO.Read() -> MapElements(lambda s : Integer.parseInt(s))
> This has better behavior, but now we go from byte[] to String to Integer 
> rather than directly from byte[] to Integer.
> The solution seems to be to explicitly add Parser and Coder abstractions.
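The second pattern from the issue can be shown with plain Java stand-ins (no Beam types; `ParseAfterRead` is an invented name): read the records as text, then parse in a separate map step, so the parsed collection can carry an efficient coder such as VarIntCoder instead of inheriting TextualIntegerCoder as its default.

```java
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java stand-in for TextIO.Read() -> MapElements(parseInt): parsing is
// a transform step, decoupled from how the parsed elements are later encoded.
class ParseAfterRead {
  static List<Integer> parse(List<String> lines) {
    return lines.stream()
        .map(Integer::parseInt)   // the MapElements(lambda s : Integer.parseInt(s)) step
        .collect(Collectors.toList());
  }
}
```

The trade-off named above still shows here: the data passes through String on its way from byte[] to Integer, which a dedicated Parser abstraction would avoid.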





[jira] [Updated] (BEAM-1269) BigtableIO should make more efficient use of connections

2017-03-29 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-1269:
--
Labels: newbie starter  (was: )

> BigtableIO should make more efficient use of connections
> 
>
> Key: BEAM-1269
> URL: https://issues.apache.org/jira/browse/BEAM-1269
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>  Labels: newbie, starter
>
> Right now, {{BigtableIO}} opens up a new Bigtable session for every DoFn, in
> the {{@Setup}} function. However, sessions can support multiple connections, 
> so perhaps this code should be modified to open up a smaller session pool and 
> then allocate connections in {{@StartBundle}}.
> This would likely make more efficient use of resources, especially for highly 
> multithreaded workers.
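The proposed shift can be illustrated with mock objects (NOT the Bigtable client; `PooledSessions`, `SessionPool`, and their methods are invented for illustration): open one shared session pool once, then hand out a connection per bundle, instead of opening a full session per DoFn instance in @Setup.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy illustration (mock objects, not the Bigtable client) of sharing one
// session pool across DoFn instances and borrowing connections per bundle.
class PooledSessions {
  static final AtomicInteger sessionsOpened = new AtomicInteger();

  static class SessionPool {
    SessionPool() {
      sessionsOpened.incrementAndGet(); // the expensive part happens once
    }

    String borrowConnection() {
      return "conn"; // a real pool would hand out one of N live connections
    }
  }

  private static SessionPool sharedPool; // shared across DoFn instances

  // @Setup equivalent: lazily create the single shared pool.
  static synchronized SessionPool setup() {
    if (sharedPool == null) {
      sharedPool = new SessionPool();
    }
    return sharedPool;
  }

  // @StartBundle equivalent: each bundle borrows from the shared pool.
  static String startBundle() {
    return setup().borrowConnection();
  }
}
```

However many bundles start, only one session is ever opened, which is the resource saving the issue is after on highly multithreaded workers.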





[jira] [Resolved] (BEAM-717) Change KafkaIO API to be consistent with style guide

2017-03-29 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov resolved BEAM-717.
---
Resolution: Duplicate

> Change KafkaIO API to be consistent with style guide
> 
>
> Key: BEAM-717
> URL: https://issues.apache.org/jira/browse/BEAM-717
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Luke Cwik
>Assignee: Eugene Kirpichov
>Priority: Minor
>  Labels: backward-incompatible, io, simple, starter
> Fix For: 0.6.0
>
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054





[jira] [Reopened] (BEAM-717) Change KafkaIO API to be consistent with style guide

2017-03-29 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov reopened BEAM-717:
---

It remains to also rid it of using Coders for serialization.

> Change KafkaIO API to be consistent with style guide
> 
>
> Key: BEAM-717
> URL: https://issues.apache.org/jira/browse/BEAM-717
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Luke Cwik
>Assignee: Eugene Kirpichov
>Priority: Minor
>  Labels: backward-incompatible, io, simple, starter
> Fix For: 0.6.0
>
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054





[jira] [Commented] (BEAM-1269) BigtableIO should make more efficient use of connections

2017-03-29 Thread Solomon Duskis (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947642#comment-15947642
 ] 

Solomon Duskis commented on BEAM-1269:
--

Cloud Bigtable client 0.9.6 was just released, and should be flowing through 
the maven repo process now.

This feature can be invoked via BigtableOptions.setUseCachedDataPool(true)

I have a follow up request to also set 
BigtableOptions.setDataHost(BigtableOptions.BIGTABLE_BATCH_DATA_HOST_DEFAULT) 
which will be a host dedicated to Batch type workloads like Dataflow.

> BigtableIO should make more efficient use of connections
> 
>
> Key: BEAM-1269
> URL: https://issues.apache.org/jira/browse/BEAM-1269
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>
> Right now, {{BigtableIO}} opens up a new Bigtable session for every DoFn, in
> the {{@Setup}} function. However, sessions can support multiple connections, 
> so perhaps this code should be modified to open up a smaller session pool and 
> then allocate connections in {{@StartBundle}}.
> This would likely make more efficient use of resources, especially for highly 
> multithreaded workers.





Build failed in Jenkins: beam_PerformanceTests_Dataflow #245

2017-03-29 Thread Apache Jenkins Server
See 


Changes:

[amitsela33] [BEAM-1827] Fix use of deprecated Spark APIs in the runner.

--
[...truncated 272.66 KB...]
 * [new ref] refs/pull/928/head -> origin/pr/928/head
 * [new ref] refs/pull/929/head -> origin/pr/929/head
 * [new ref] refs/pull/93/head -> origin/pr/93/head
 * [new ref] refs/pull/930/head -> origin/pr/930/head
 * [new ref] refs/pull/930/merge -> origin/pr/930/merge
 * [new ref] refs/pull/931/head -> origin/pr/931/head
 * [new ref] refs/pull/931/merge -> origin/pr/931/merge
 * [new ref] refs/pull/932/head -> origin/pr/932/head
 * [new ref] refs/pull/932/merge -> origin/pr/932/merge
 * [new ref] refs/pull/933/head -> origin/pr/933/head
 * [new ref] refs/pull/933/merge -> origin/pr/933/merge
 * [new ref] refs/pull/934/head -> origin/pr/934/head
 * [new ref] refs/pull/934/merge -> origin/pr/934/merge
 * [new ref] refs/pull/935/head -> origin/pr/935/head
 * [new ref] refs/pull/936/head -> origin/pr/936/head
 * [new ref] refs/pull/936/merge -> origin/pr/936/merge
 * [new ref] refs/pull/937/head -> origin/pr/937/head
 * [new ref] refs/pull/937/merge -> origin/pr/937/merge
 * [new ref] refs/pull/938/head -> origin/pr/938/head
 * [new ref] refs/pull/939/head -> origin/pr/939/head
 * [new ref] refs/pull/94/head -> origin/pr/94/head
 * [new ref] refs/pull/940/head -> origin/pr/940/head
 * [new ref] refs/pull/940/merge -> origin/pr/940/merge
 * [new ref] refs/pull/941/head -> origin/pr/941/head
 * [new ref] refs/pull/941/merge -> origin/pr/941/merge
 * [new ref] refs/pull/942/head -> origin/pr/942/head
 * [new ref] refs/pull/942/merge -> origin/pr/942/merge
 * [new ref] refs/pull/943/head -> origin/pr/943/head
 * [new ref] refs/pull/943/merge -> origin/pr/943/merge
 * [new ref] refs/pull/944/head -> origin/pr/944/head
 * [new ref] refs/pull/945/head -> origin/pr/945/head
 * [new ref] refs/pull/945/merge -> origin/pr/945/merge
 * [new ref] refs/pull/946/head -> origin/pr/946/head
 * [new ref] refs/pull/946/merge -> origin/pr/946/merge
 * [new ref] refs/pull/947/head -> origin/pr/947/head
 * [new ref] refs/pull/947/merge -> origin/pr/947/merge
 * [new ref] refs/pull/948/head -> origin/pr/948/head
 * [new ref] refs/pull/948/merge -> origin/pr/948/merge
 * [new ref] refs/pull/949/head -> origin/pr/949/head
 * [new ref] refs/pull/949/merge -> origin/pr/949/merge
 * [new ref] refs/pull/95/head -> origin/pr/95/head
 * [new ref] refs/pull/95/merge -> origin/pr/95/merge
 * [new ref] refs/pull/950/head -> origin/pr/950/head
 * [new ref] refs/pull/951/head -> origin/pr/951/head
 * [new ref] refs/pull/951/merge -> origin/pr/951/merge
 * [new ref] refs/pull/952/head -> origin/pr/952/head
 * [new ref] refs/pull/952/merge -> origin/pr/952/merge
 * [new ref] refs/pull/953/head -> origin/pr/953/head
 * [new ref] refs/pull/954/head -> origin/pr/954/head
 * [new ref] refs/pull/954/merge -> origin/pr/954/merge
 * [new ref] refs/pull/955/head -> origin/pr/955/head
 * [new ref] refs/pull/955/merge -> origin/pr/955/merge
 * [new ref] refs/pull/956/head -> origin/pr/956/head
 * [new ref] refs/pull/957/head -> origin/pr/957/head
 * [new ref] refs/pull/958/head -> origin/pr/958/head
 * [new ref] refs/pull/959/head -> origin/pr/959/head
 * [new ref] refs/pull/959/merge -> origin/pr/959/merge
 * [new ref] refs/pull/96/head -> origin/pr/96/head
 * [new ref] refs/pull/96/merge -> origin/pr/96/merge
 * [new ref] refs/pull/960/head -> origin/pr/960/head
 * [new ref] refs/pull/960/merge -> origin/pr/960/merge
 * [new ref] refs/pull/961/head -> origin/pr/961/head
 * [new ref] refs/pull/962/head -> origin/pr/962/head
 * [new ref] refs/pull/962/merge -> origin/pr/962/merge
 * [new ref] refs/pull/963/head -> origin/pr/963/head
 * [new ref] refs/pull/963/merge -> origin/pr/963/merge
 * [new ref] refs/pull/964/head -> origin/pr/964/head
 * [new ref] refs/pull/965/head -> origin/pr/965/head
 * [new ref] refs/pull/965/merge -> origin/pr/965/merge
 * [new ref] refs/pull/966/head -> origin/pr/966/head
 * [new ref] refs/pull/967/head -> origin/pr/967/head
 * [new ref] refs/pull/967/merge -> origin/pr/967/merge
 * [new ref] refs/pull/968/head -> origin/pr/968/head
 * [new ref] refs/pull/968/merge -> origin/pr/968/merge
 * [new ref] refs/pull/969/head -> origin/pr/969/head
 * [new ref] refs/pull/969/merge -> 

[jira] [Commented] (BEAM-778) Make fileio._CompressedFile seekable.

2017-03-29 Thread Konstantinos Katsiapis (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947589#comment-15947589
 ] 

Konstantinos Katsiapis commented on BEAM-778:
-

In general the Beam programming model usually requires thread-compatibility 
(not the stronger thread-safety). I would hazard a guess that we should follow
suit for _CompressedFile (i.e. no need for locking?). Having said that, it might
be worth documenting the thread-compatibility but [~chamikara] or [~altay] 
would have a better sense about overall documentation etc.

> Make fileio._CompressedFile seekable.
> -
>
> Key: BEAM-778
> URL: https://issues.apache.org/jira/browse/BEAM-778
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Tibor Kiss
> Fix For: Not applicable
>
>
> We have a TODO to make fileio._CompressedFile seekable.
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L692
> Without this, compressed file objects produced for FileBasedSource
> implementations may not be able to use libraries that utilize methods seek()
> and tell().
> For example tarfile.open().





[jira] [Commented] (BEAM-1829) MQTT message compression not working on Raspberry Pi

2017-03-29 Thread Vassil Kolarov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947573#comment-15947573
 ] 

Vassil Kolarov commented on BEAM-1829:
--

[~davor], sure. I'll test on a Raspberry tomorrow and submit a PR if everything 
is ok.

> MQTT message compression not working on Raspberry Pi
> 
>
> Key: BEAM-1829
> URL: https://issues.apache.org/jira/browse/BEAM-1829
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: 0.6.0
>Reporter: Vassil Kolarov
>Assignee: Davor Bonaci
>  Labels: MQTT, Snappy
>
> Most probably due to this bug: 
> https://github.com/xerial/snappy-java/issues/147, the following exception is 
> raised when running on Raspberry Pi:
> Exception in thread "main" java.lang.UnsatisfiedLinkError: 
> /root/~/tmp/snappy-1.1.2-3c6134d1-26c5-4fb0-b6c9-669d4848d15b-libsnappyjava.so:
> /root/~/tmp/snappy-1.1.2-3c6134d1-26c5-4fb0-b6c9-669d4848d15b-libsnappyjava.so:
> cannot open shared object file: No such file or directory
> at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1929)
> at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1814)
> at java.lang.Runtime.load0(Runtime.java:809)
> at java.lang.System.load(System.java:1083)
> at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:174)
> at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:152)
> at org.xerial.snappy.Snappy.<clinit>(Snappy.java:46)
> at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:97)
> at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:89)
> at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
> at org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:48)
> at org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:83)
> at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:141)
> at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:136)
> at org.apache.beam.sdk.io.Read.from(Read.java:56)
> at org.apache.beam.sdk.io.mqtt.MqttIO$Read.expand(MqttIO.java:274)
> at org.apache.beam.sdk.io.mqtt.MqttIO$Read.expand(MqttIO.java:221)
> at org.apache.beam.sdk.runners.PipelineRunner.apply(PipelineRunner.java:76)
> at org.apache.beam.runners.direct.DirectRunner.apply(DirectRunner.java:296)
> at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:388)
> at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:302)
> at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
> at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:152)
> at org.blah.beam.MqttPipeline.main(MqttPipeline.java:37)
> Increasing the snappy version to 1.1.4 will probably fix the issue.
> Best regards,
> Vassil





[jira] [Commented] (BEAM-778) Make fileio._CompressedFile seekable.

2017-03-29 Thread Tibor Kiss (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947561#comment-15947561
 ] 

Tibor Kiss commented on BEAM-778:
-

I'm still working on the seek() implementation and I have noticed that there is 
no lock protecting the {{_read_buffer}} object. 
I'm not completely sure, though, whether multiple threads accessing the same 
_CompressedFile object is a valid scenario.

Any thoughts on extending this class with a lock on {{_read_buffer}}?
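For illustration, a minimal dependency-free Java sketch of the locking pattern under discussion (hypothetical class and method names, not Beam's Python _CompressedFile): a single lock serializes access to a shared buffer so interleaved read()/seek() calls from multiple threads cannot observe a half-updated position.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a shared read buffer guarded by one lock, so
// concurrent read()/seek()/tell() calls cannot interleave mid-operation.
class LockedBuffer {
    private final byte[] data;
    private int position = 0;
    private final ReentrantLock lock = new ReentrantLock();

    LockedBuffer(byte[] data) {
        this.data = data;
    }

    // Returns the next byte (0-255), or -1 at end of buffer.
    int read() {
        lock.lock();
        try {
            return position < data.length ? data[position++] & 0xFF : -1;
        } finally {
            lock.unlock();
        }
    }

    void seek(int newPosition) {
        lock.lock();
        try {
            position = newPosition;
        } finally {
            lock.unlock();
        }
    }

    int tell() {
        lock.lock();
        try {
            return position;
        } finally {
            lock.unlock();
        }
    }
}
```

Whether _CompressedFile needs this at all depends on the thread-compatibility question above; if Beam only ever hands the object to one thread at a time, the lock is unnecessary overhead.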

> Make fileio._CompressedFile seekable.
> -
>
> Key: BEAM-778
> URL: https://issues.apache.org/jira/browse/BEAM-778
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Tibor Kiss
> Fix For: Not applicable
>
>
> We have a TODO to make fileio._CompressedFile seekable.
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L692
> Without this, compressed file objects produced for FileBasedSource 
> implementations may not be able to use libraries that utilize the methods 
> seek() and tell(), for example tarfile.open().





[GitHub] beam pull request #2360: Remove View indicator classes

2017-03-29 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2360

Remove View indicator classes

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
These indicator classes have been replaced everywhere by
referencing the available ViewFns.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam non_visible_view_classes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2360.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2360






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2359: Needs runner

2017-03-29 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2359

Needs runner


---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam NeedsRunner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2359.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2359


commit b9fd930da01a15f01dfa782b278355e036574d66
Author: Kenneth Knowles 
Date:   2017-03-29T16:53:19Z

Remove NeedsRunner category tags from non-core libraries

commit 7ad1d32d6bc53ee5df89c7a08d45616e4f8aee5b
Author: Kenneth Knowles 
Date:   2017-03-29T16:57:27Z

Remove ValidatesRunner category tags from non-core libraries






[jira] [Commented] (BEAM-818) Remove Pipeline.getPipelineOptions

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947487#comment-15947487
 ] 

ASF GitHub Bot commented on BEAM-818:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2358

[BEAM-818] Mark Pipeline#getPipelineOptions Experimental


---
BEAM-818 plans to remove this method when it is not used within the Beam
SDK. Users should not use it, and marking Experimental as well as
Deprecated permits its removal within a major version.
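A self-contained sketch of the double-marking described above. Beam's real annotation lives in org.apache.beam.sdk.annotations; here a local stand-in @Experimental is defined (an assumption for illustration only), alongside a hypothetical Pipeline class, so the example compiles without Beam on the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in for Beam's org.apache.beam.sdk.annotations.Experimental, defined
// locally so this sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.TYPE})
@interface Experimental {}

class Pipeline {
    // Doubly marked: @Deprecated steers callers away now, while @Experimental
    // signals the method may be removed without waiting for a major version.
    @Deprecated
    @Experimental
    Object getPipelineOptions() {
        return null;
    }

    // Reflection helper to verify that a given marker is present on the method.
    static boolean isMarked(Class<? extends java.lang.annotation.Annotation> a) {
        try {
            return Pipeline.class.getDeclaredMethod("getPipelineOptions")
                .isAnnotationPresent(a);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

The runtime retention matters: tooling and release audits can then discover the Experimental marker reflectively, not just at compile time.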



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam experimental_getPipelineOptions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2358.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2358


commit 7975bf507c6f2b8edb1b3168231f13ae9b0f5dc8
Author: Thomas Groh 
Date:   2017-03-29T16:45:15Z

Mark Pipeline#getPipelineOptions Experimental

BEAM-818 plans to remove this method when it is not used within the Beam
SDK. Users should not use it, and marking Experimental as well as
Deprecated permits its removal within a major version.




> Remove Pipeline.getPipelineOptions
> --
>
> Key: BEAM-818
> URL: https://issues.apache.org/jira/browse/BEAM-818
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This stops transforms from changing their operation based on 
> construction-time options, and instead requires that configuration to be 
> explicit, or to obtain the configuration at runtime.
> https://docs.google.com/document/d/1Wr05cYdqnCfrLLqSk--XmGMGgDwwNwWZaFbxLKvPqEQ/edit#





[GitHub] beam pull request #2358: [BEAM-818] Mark Pipeline#getPipelineOptions Experim...

2017-03-29 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2358

[BEAM-818] Mark Pipeline#getPipelineOptions Experimental


---
BEAM-818 plans to remove this method when it is not used within the Beam
SDK. Users should not use it, and marking Experimental as well as
Deprecated permits its removal within a major version.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam experimental_getPipelineOptions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2358.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2358


commit 7975bf507c6f2b8edb1b3168231f13ae9b0f5dc8
Author: Thomas Groh 
Date:   2017-03-29T16:45:15Z

Mark Pipeline#getPipelineOptions Experimental

BEAM-818 plans to remove this method when it is not used within the Beam
SDK. Users should not use it, and marking Experimental as well as
Deprecated permits its removal within a major version.






[jira] [Commented] (BEAM-1308) Consider removing TimeDomain from user APIs

2017-03-29 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947481#comment-15947481
 ] 

Kenneth Knowles commented on BEAM-1308:
---

Marking backward-incompatible, but also intending for this whole nascent region 
of the API to be marked Experimental for the first stable release.

> Consider removing TimeDomain from user APIs
> ---
>
> Key: BEAM-1308
> URL: https://issues.apache.org/jira/browse/BEAM-1308
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
>
> Today the user declares what TimeDomain they want to have their timer in. 
> There are three time domains, and one of them is not really suitable for 
> users.





[jira] [Assigned] (BEAM-26) Model for validation of user state upon pipeline update

2017-03-29 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-26?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-26:
---

Assignee: (was: Kenneth Knowles)

> Model for validation of user state upon pipeline update
> ---
>
> Key: BEAM-26
> URL: https://issues.apache.org/jira/browse/BEAM-26
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, beam-model-runner-api
>Reporter: Kenneth Knowles
>  Labels: State
>






[jira] [Updated] (BEAM-1308) Consider removing TimeDomain from user APIs

2017-03-29 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-1308:
--
Labels: backward-incompatible  (was: )

> Consider removing TimeDomain from user APIs
> ---
>
> Key: BEAM-1308
> URL: https://issues.apache.org/jira/browse/BEAM-1308
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
>
> Today the user declares what TimeDomain they want to have their timer in. 
> There are three time domains, and one of them is not really suitable for 
> users.





[jira] [Assigned] (BEAM-1308) Consider removing TimeDomain from user APIs

2017-03-29 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-1308:
-

Assignee: (was: Kenneth Knowles)

> Consider removing TimeDomain from user APIs
> ---
>
> Key: BEAM-1308
> URL: https://issues.apache.org/jira/browse/BEAM-1308
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
>
> Today the user declares what TimeDomain they want to have their timer in. 
> There are three time domains, and one of them is not really suitable for 
> users.





[jira] [Commented] (BEAM-1832) Potentially unclosed OutputStream in ApexYarnLauncher

2017-03-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947383#comment-15947383
 ] 

Ted Yu commented on BEAM-1832:
--

In the same class:
{code}
  public static List getYarnDeployDependencies() throws IOException {
    InputStream dependencyTree =
        ApexRunner.class.getResourceAsStream("dependency-tree");
{code}
{{dependencyTree}} should be closed even if an exception is thrown in the while loop.

> Potentially unclosed OutputStream in ApexYarnLauncher
> -
>
> Key: BEAM-1832
> URL: https://issues.apache.org/jira/browse/BEAM-1832
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Ted Yu
>Priority: Minor
>
> Here is an example from createJar():
> {code}
>   final OutputStream out =
>       Files.newOutputStream(zipfs.getPath(JarFile.MANIFEST_NAME));
>   if (!manifestFile.exists()) {
>     new Manifest().write(out);
>   } else {
>     FileUtils.copyFile(manifestFile, out);
>   }
>   out.close();
> {code}
> If FileUtils.copyFile throws IOException, out would be left unclosed.





[jira] [Created] (BEAM-1832) Potentially unclosed OutputStream in ApexYarnLauncher

2017-03-29 Thread Ted Yu (JIRA)
Ted Yu created BEAM-1832:


 Summary: Potentially unclosed OutputStream in ApexYarnLauncher
 Key: BEAM-1832
 URL: https://issues.apache.org/jira/browse/BEAM-1832
 Project: Beam
  Issue Type: Bug
  Components: runner-apex
Reporter: Ted Yu
Priority: Minor


Here is an example from createJar():
{code}
  final OutputStream out =
      Files.newOutputStream(zipfs.getPath(JarFile.MANIFEST_NAME));
  if (!manifestFile.exists()) {
    new Manifest().write(out);
  } else {
    FileUtils.copyFile(manifestFile, out);
  }
  out.close();
{code}
If FileUtils.copyFile throws IOException, out would be left unclosed.
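A try-with-resources form closes the stream even when the copy throws. A simplified, JDK-only sketch of the fix (plain file paths and Files.copy stand in for the zip-filesystem and commons-io calls in createJar(); the added Manifest-Version attribute is also an assumption, since Manifest.write expects it to be set):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class ManifestWriter {
    // Try-with-resources guarantees out.close() runs on every path, including
    // when the copy throws, unlike a trailing explicit out.close().
    static void writeManifest(Path manifestFile, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            if (!Files.exists(manifestFile)) {
                Manifest m = new Manifest();
                m.getMainAttributes()
                    .put(Attributes.Name.MANIFEST_VERSION, "1.0");
                m.write(out);
            } else {
                Files.copy(manifestFile, out);
            }
        }
    }

    // Small self-check: writing a default manifest produces the target file.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("manifest-demo");
            Path target = dir.resolve("MANIFEST.MF");
            writeManifest(dir.resolve("no-such-manifest"), target);
            return Files.exists(target);
        } catch (IOException e) {
            return false;
        }
    }
}
```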





[jira] [Updated] (BEAM-1831) Checking of containment in createdTables may have race condition in StreamingWriteFn

2017-03-29 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-1831:
--
Component/s: (was: runner-dataflow)
 sdk-java-gcp

> Checking of containment in createdTables may have race condition in 
> StreamingWriteFn
> 
>
> Key: BEAM-1831
> URL: https://issues.apache.org/jira/browse/BEAM-1831
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Ted Yu
>Assignee: Davor Bonaci
>Priority: Minor
>
> {code}
>   public TableReference getOrCreateTable(BigQueryOptions options, String tableSpec)
>       throws InterruptedException, IOException {
>     TableReference tableReference = BigQueryHelpers.parseTableSpec(tableSpec);
>     if (createDisposition != createDisposition.CREATE_NEVER
>         && !createdTables.contains(tableSpec)) {
>       synchronized (createdTables) {
>         // Another thread may have succeeded in creating the table in the
>         // meanwhile, so check again. This check isn't needed for correctness,
>         // but we add it to prevent every thread from attempting a create and
>         // overwhelming our BigQuery quota.
>         DatasetService datasetService = bqServices.getDatasetService(options);
>         if (!createdTables.contains(tableSpec)) {
> {code}
> The first createdTables.contains() check is outside the synchronized block.
> At least createdTables should be declared volatile for the double-checked
> locking to work.
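One way to make the lock-free first check safe is to back the set with a concurrent collection rather than rely on volatile. A JDK-only sketch of the pattern (illustrative names; an AtomicInteger stands in for the actual table-creation RPC, and this is not the BigQueryIO code itself):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class TableCreator {
    // Backing the set with ConcurrentHashMap makes the unsynchronized
    // contains() check safe; the synchronized block then only serves to
    // collapse concurrent creates of the same table into one.
    private final Set<String> createdTables = ConcurrentHashMap.newKeySet();
    final AtomicInteger creates = new AtomicInteger();

    void getOrCreateTable(String tableSpec) {
        if (!createdTables.contains(tableSpec)) {        // fast path, no lock
            synchronized (createdTables) {
                if (!createdTables.contains(tableSpec)) { // re-check under lock
                    creates.incrementAndGet();            // stand-in for the RPC
                    createdTables.add(tableSpec);
                }
            }
        }
    }
}
```

With a plain HashSet, the lock-free contains() is a data race; with a concurrent set, the double-checked structure is well-defined and the worst case is a harmless extra trip into the synchronized block.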





[jira] [Assigned] (BEAM-190) Dead-letter drop for bad BigQuery records

2017-03-29 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reassigned BEAM-190:


Assignee: Reuven Lax

> Dead-letter drop for bad BigQuery records
> -
>
> Key: BEAM-190
> URL: https://issues.apache.org/jira/browse/BEAM-190
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Reuven Lax
>
> If a BigQuery insert fails for data-specific rather than structural reasons 
> (e.g., cannot parse a date), then the bundle will be retried indefinitely, first 
> by BigQueryTableInserter.insertAll and then by the overall production retry logic 
> of the underlying runner.
> Better would be to allow the customer to specify a dead-letter store for records 
> such as these, so that overall processing can continue while bad records are 
> quarantined.





[jira] [Commented] (BEAM-1826) Allow BigqueryIO to forward errors

2017-03-29 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947374#comment-15947374
 ] 

Daniel Halperin commented on BEAM-1826:
---

This sounds like [BEAM-190] to me -- is that accurate?

> Allow BigqueryIO to forward errors
> --
>
> Key: BEAM-1826
> URL: https://issues.apache.org/jira/browse/BEAM-1826
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Kevin Peterson
>Assignee: Daniel Halperin
>Priority: Minor
>
> Most sinks are terminal: data ends at the sink. While on occasion the sink 
> may temporarily fail due to resource unavailability, it will eventually 
> succeed. However, some sinks have strict requirements on their input format. In 
> these cases, retries will never succeed, and continuous retrying will 
> eventually lead to pipeline failure.
> The primary use case I have in mind is streaming data to a sink such as 
> BigQuery, where data of the wrong format could fail on insert.
> It would be useful to be able to set a side output or downstream transform 
> from BigQuery which can receive failed rows where a retry will never fix the 
> issue, allowing them to be persisted to a different, more permissive output 
> to prevent data loss.
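A Beam-independent sketch of the routing being requested: records whose failure is permanent (here, a parse error stands in for a malformed BigQuery row) are diverted to a dead-letter collection instead of being retried forever. The record type and parse rule are hypothetical; in Beam Java this role is typically played by a ParDo with multiple output tags.

```java
import java.util.ArrayList;
import java.util.List;

class DeadLetterRouter {
    final List<Integer> inserted = new ArrayList<>();
    final List<String> deadLetters = new ArrayList<>();

    // Route each record: parse failures are permanent, so they are quarantined
    // in the dead-letter list and processing of later records continues.
    void process(List<String> records) {
        for (String record : records) {
            try {
                inserted.add(Integer.parseInt(record)); // stand-in for an insert
            } catch (NumberFormatException e) {
                deadLetters.add(record);                // quarantined, not retried
            }
        }
    }
}
```

The key property is that one bad record no longer blocks the bundle: good records flow through while the bad ones land somewhere a human (or a follow-up pipeline) can inspect them.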





[jira] [Created] (BEAM-1831) Checking of containment in createdTables may have race condition in StreamingWriteFn

2017-03-29 Thread Ted Yu (JIRA)
Ted Yu created BEAM-1831:


 Summary: Checking of containment in createdTables may have race 
condition in StreamingWriteFn
 Key: BEAM-1831
 URL: https://issues.apache.org/jira/browse/BEAM-1831
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Ted Yu
Assignee: Davor Bonaci
Priority: Minor


{code}
  public TableReference getOrCreateTable(BigQueryOptions options, String tableSpec)
      throws InterruptedException, IOException {
    TableReference tableReference = BigQueryHelpers.parseTableSpec(tableSpec);
    if (createDisposition != createDisposition.CREATE_NEVER
        && !createdTables.contains(tableSpec)) {
      synchronized (createdTables) {
        // Another thread may have succeeded in creating the table in the
        // meanwhile, so check again. This check isn't needed for correctness,
        // but we add it to prevent every thread from attempting a create and
        // overwhelming our BigQuery quota.
        DatasetService datasetService = bqServices.getDatasetService(options);
        if (!createdTables.contains(tableSpec)) {
{code}
The first createdTables.contains() check is outside the synchronized block.
At least createdTables should be declared volatile for the double-checked 
locking to work.





[jira] [Comment Edited] (BEAM-1829) MQTT message compression not working on Raspberry Pi

2017-03-29 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947334#comment-15947334
 ] 

Davor Bonaci edited comment on BEAM-1829 at 3/29/17 3:23 PM:
-

[~vassil], do you perhaps want to submit a PR to fix this? Would be much 
appreciated!

CC: [~jbonofre].


was (Author: davor):
[~vassil], do you perhaps want to submit a PR to fix this? Would be much 
appreciated!






[jira] [Commented] (BEAM-1829) MQTT message compression not working on Raspberry Pi

2017-03-29 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947334#comment-15947334
 ] 

Davor Bonaci commented on BEAM-1829:


[~vassil], do you perhaps want to submit a PR to fix this? Would be much 
appreciated!






[jira] [Updated] (BEAM-1829) MQTT message compression not working on Raspberry Pi

2017-03-29 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-1829:
---
Component/s: (was: sdk-java-core)
 sdk-java-extensions






[jira] [Resolved] (BEAM-1827) Fix use of deprecated Spark APIs in the runner.

2017-03-29 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela resolved BEAM-1827.
-
   Resolution: Fixed
Fix Version/s: First stable release

> Fix use of deprecated Spark APIs in the runner. 
> 
>
> Key: BEAM-1827
> URL: https://issues.apache.org/jira/browse/BEAM-1827
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: First stable release
>
>






[jira] [Commented] (BEAM-1827) Fix use of deprecated Spark APIs in the runner.

2017-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947316#comment-15947316
 ] 

ASF GitHub Bot commented on BEAM-1827:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2354


> Fix use of deprecated Spark APIs in the runner. 
> 
>
> Key: BEAM-1827
> URL: https://issues.apache.org/jira/browse/BEAM-1827
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
>





